|
2024 |
Khalid Javeed, Ali El-Moursy, David Gregg |
E2CSM: Efficient FPGA implementation of elliptic curve scalar multiplication over generic prime field GF(p) Journal of Supercomputing, 80(1), pp50-74 |
|
2023 |
Syed Asad Alam, David Gregg, Giulio Gambardella, Thomas B. Preusser, Michaela Blott |
On the RTL Implementation of FINN Matrix Vector Unit ACM Transactions on Embedded Computing Systems, 22(6), pp94:1-94:27 |
|
2023 |
Khalid Javeed, Ali El-Moursy, David Gregg |
EC-Crypto: Highly Efficient Area-Delay Optimized Elliptic Curve Cryptography Processor IEEE Access, 11, pp56649-56662 |
|
2022 |
Khalid Javeed, Kamran Saeed, David Gregg |
High-speed parallel reconfigurable Fp multipliers for elliptic curve cryptography applications International Journal of Circuit Theory and Applications, 50(4), pp1160-1173 |
|
2022 |
Paul Biggar, David Gregg |
Building SSA in a Compiler for PHP SSA-based Compiler Design, pp347-357 |
|
2022 |
Muslim Chochlov, Gul Aftab Ahmed, James Vincent Patten, Guoxian Lu, Wei Hou, David Gregg, Jim Buckley |
Using a Nearest-Neighbour, BERT-Based Approach for Scalable Clone Detection IEEE International Conference on Software Maintenance and Evolution, pp582-591. DOI: https://doi.org/10.1109/ICSME55016.2022.00080 |
|
2022 |
Syed Asad Alam, Andrew Anderson, Barbara Barabasz, David Gregg |
Winograd Convolution for Deep Neural Networks: Efficient Point Selection ACM Transactions on Embedded Computing Systems, 21(6), pp80:1-80:28 |
|
2021 |
Kaveena Persand, Andrew Anderson, David Gregg |
Taxonomy of Saliency Metrics for Channel Pruning IEEE Access, 9, pp120110-120126. DOI: https://doi.org/10.1109/ACCESS.2021.3108545 |
|
2021 |
Syed Asad Alam, James Garland, David Gregg |
Low-precision Logarithmic Number Systems: Beyond Base-2 ACM Transactions on Architecture and Code Optimization, 18(4), pp47:1-47:25. DOI: https://doi.org/10.1145/3461699 |
|
2021 |
Kaveena Persand, Andrew Anderson, David Gregg |
Domino Saliency Metrics: Improving Existing Channel Saliency Metrics with Structural Information 20th International Conference of the Italian Association for Artificial Intelligence (AIxIA 2021), 13196, pp447-461. DOI: https://doi.org/10.1007/978-3-031-08421-8_31 |
|
2020 |
Barabasz, Barbara and Anderson, Andrew and Soodhalter, Kirk M. and Gregg, David |
Error Analysis and Improving the Accuracy of Winograd Convolution for Deep Neural Networks ACM Trans. Math. Softw., 46(4), pp33. DOI: http://dx.doi.org/10.1145/3412380 |
|
2019 |
James Garland, David Gregg |
Low Complexity Multiply-Accumulate Units for Convolutional Neural Networks with Weight-Sharing High-performance Embedded Architecture and Compilation, 15(3), pp31:1-31:24 |
|
2019 |
Andrew Anderson, James Garland, Yuan Wen, Barbara Barabasz, Kaveena Persand, Aravind Vasudevan, David Gregg |
Hardware and software performance in deep learning Many-Core Computing: Hardware and Software, pp141-164 |
|
2019 |
Andrew Anderson, Michael Doyle, David Gregg |
Scalar Arithmetic Multiple Data: Customizable Precision for Deep Neural Networks 26th IEEE Symposium on Computer Arithmetic (ARITH 2019), pp61-68 |
|
2018 |
Anderson, A. and Gregg, D. |
Optimal DNN primitive selection with partitioned boolean quadratic programming , pp340-351. DOI: http://dx.doi.org/10.1145/3168805 |
|
2018 |
Garland, J. and Gregg, D. |
Low complexity multiply-accumulate units for convolutional neural networks with weight-sharing ACM Transactions on Architecture and Code Optimization, 15(3). DOI: http://dx.doi.org/10.1145/3233300 |
|
2018 |
James Garland, David Gregg |
Low Complexity Multiply-Accumulate Units for Convolutional Neural Networks with Weight-Sharing ACM Transactions on Architecture and Code Optimization, 15(3), pp31:1-31:24 |
|
2017 |
Anderson, A. and Muralidharan, S. and Gregg, D. |
Efficient Multibyte Floating Point Data Formats Using Vectorization IEEE Transactions on Computers, 66(12), pp2081-2096. DOI: http://dx.doi.org/10.1109/TC.2017.2716355 |
|
2017 |
Xu, S. and Gregg, D. |
Bitslice Vectors: A Software Approach to Customizable Data Precision on Processors with SIMD Extensions Proceedings of the International Conference on Parallel Processing (8025318), pp442-451. DOI: http://dx.doi.org/10.1109/ICPP.2017.53 |
|
2017 |
Vasudevan, A. and Anderson, A. and Gregg, D. |
Parallel Multi Channel convolution using General Matrix Multiplication IEEE 28th International Conference on Application-specific Systems, Architectures and Processors (ASAP) (7995254), pp19-24. DOI: http://dx.doi.org/10.1109/ASAP.2017.7995254 |
|
2017 |
James Garland, David Gregg |
Low Complexity Multiply Accumulate Unit for Weight-Sharing Convolutional Neural Networks IEEE Computer Architecture Letters, 16(2), pp132-135 |
|
2016 |
Martin Marinov, Nicholas Nash and David Gregg |
Practical algorithms for finding extremal sets ACM Journal of Experimental Algorithmics |
|
2016 |
Roman Atachiants, Gavin Doherty and David Gregg |
Parallel performance problems on shared-memory multicore systems: a taxonomy and observation IEEE Transactions on Software Engineering, 42(8), pp764-785. DOI: http://doi.ieeecomputersociety.org/10.1109/TSE.2016.2519346 |
|
2016 |
Anderson, A., Gregg, D. |
Vectorization of Multibyte Floating Point Data Formats Parallel Architectures and Compilation Techniques, pp363-372. DOI: http://dx.doi.org/10.1145/2967938.2967966 |
|
2016 |
Xu S, Gregg D |
An Efficient Vectorization Approach to Nested Thread-level Parallelism for CUDA GPUs , 2016-March, pp488-489. DOI: http://dx.doi.org/10.1109/PACT.2015.56 |
|
2016 |
Marinov, M., Nash, N., Gregg, D. |
Practical Algorithms for Finding Extremal Sets Journal of Experimental Algorithmics, 21(2), pp1.9-. DOI: http://dx.doi.org/10.1145/2893184 |
|
2016 |
Martin Marinov |
Practical Algorithms for Finding Extremal Sets Journal of Experimental Algorithmics |
|
2015 |
Avinash Malik and David Gregg |
Heuristics on Reachability Trees for Bicriteria Scheduling of Stream Graphs on Heterogeneous Multiprocessor Architectures ACM Transactions on Embedded Computing Systems, 14(2), pp23.1-23.26. DOI: http://dx.doi.org/10.1145/2638553 |
|
2015 |
Mircea Horea Ionica and David Gregg |
An evaluation of the suitability of the Movidius Myriad architecture for scientific computing IEEE Micro, 35(1), pp6-14. DOI: http://dx.doi.org/10.1109/MM.2015.4 |
|
2015 |
Andrew Anderson, Avinash Malik, David Gregg |
Automatic Vectorization of Interleaved Data Revisited ACM Transactions on Architecture and Code Optimization, 12(4), pp50-. DOI: http://dx.doi.org/10.1145/2838735 |
|
2015 |
Malik A, Gregg D |
Heuristics on reachability trees for bicriteria scheduling of stream graphs on heterogeneous multiprocessor architectures ACM Transactions on Embedded Computing Systems, 14(2), pp23-. DOI: http://dx.doi.org/10.1145/2638553 |
|
2015 |
Xu S, Gregg D |
Exploiting Hyper-Loop Parallelism in Vectorization to Improve Memory Performance on CUDA GPGPU , 3, pp53-60. DOI: http://dx.doi.org/10.1109/Trustcom.2015.612 |
|
2014 |
R. Atachiants, D. Gregg, K. Jarvis and G. Doherty |
Design Considerations for Parallel Performance Tools ACM Conference on Human Factors in Computing systems (CHI 2014), pp2501-2510. DOI: http://dx.doi.org/10.1145/2556288.2557350 |
|
2014 |
Shixiong Xu, David Gregg |
Semi-automatic composition of data layout transformations for loop vectorization Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 8707 LNCS, pp485-496. DOI: http://dx.doi.org/10.1007/978-3-662-44917-2_40 |
|
2014 |
Aravind Vasudevan, Quentin Bragard, Anthony Ventresque, Liam Murphy and David Gregg |
An Evaluation of Space and Graph-Partitioning Methods for Distributed Road Network Simulations 2014 Winter Simulation Conference, pp4107-4108 |
|
2014 |
Aravind Vasudevan, Avinash Malik, David Gregg |
An improved simulated annealing heuristic for static partitioning of task graphs onto heterogeneous architectures 2014 20th IEEE International Conference on Parallel and Distributed Systems (ICPADS), pp95-102. DOI: http://dx.doi.org/10.1109/PADSW.2014.7097796 |
|
2013 |
Mounira Bachir, Sid Ahmed Ali Touati, Frederic Brault, David Gregg and Albert Cohen |
Minimal unroll factor for code generation of software pipelining International Journal of Parallel Programming, 41(1), pp1-58. DOI: http://dx.doi.org/10.1007/s10766-012-0203-z |
|
2013 |
Stephen Dolan, Servesh Muralidharan and David Gregg |
Compiler Support for Lightweight Context Switching ACM Transactions on Architecture and Code Optimization, 9(4), pp36.1-36.25. DOI: http://dx.doi.org/10.1145/2400682.2400695 |
|
2013 |
Avinash Malik and David Gregg |
Orchestrating stream graphs using model checking ACM Transactions on Architecture and Code Optimization. DOI: http://dx.doi.org/10.1145/2509420.2512435 |
|
2013 |
Jimmy Cleary, Owen Callanan, Mark Purcell, David Gregg |
Fast Asymmetric Thread Synchronization ACM Transactions on Architecture and Code Optimization, 9(4), pp27.1-27.22. DOI: http://dx.doi.org/10.1145/2400682.2400686 |
|
2013 |
Servesh Muralidharan, Aravind Vasudevan, Avinash Malik and David Gregg |
Heterogeneous Multiconstraint Application Partitioner (HMAP) 2013 12th IEEE International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom), pp999-1007. DOI: http://dx.doi.org/10.1109/TrustCom.2013.122 |
|
2012 |
Paul Biggar, Edsko de Vries, David Gregg |
A practical solution for achieving language compatibility in scripting language compilers Science of Computer Programming, 77(9), pp971-989. DOI: http://dx.doi.org/10.1016/j.scico.2011.01.004 |
|
2012 |
Jason McCandless and David Gregg |
Compiler techniques to improve dynamic branch prediction for indirect jump and call instructions ACM Transactions on Architecture and Code Optimization, 8(4), pp24.1-24.20. DOI: http://dx.doi.org/10.1145/2086696.2086703 |
|
2012 |
Mark Purcell, Aravind Vasudevan and David Gregg |
Real-time sensor signal capture from a harsh environment 16th IEEE/ACM International Symposium on Distributed Simulation and Real Time Applications (DS-RT 2012), pp36-43. DOI: http://dx.doi.org/10.1109/ds-rt.2012.14 |
|
2012 |
Mark Purcell, Aravind Vasudevan, David Gregg |
Real-time sensor signal capture from a harsh environment 2012 IEEE/ACM 16th International Symposium on Distributed Simulation and Real Time Applications, pp36-43. DOI: http://dx.doi.org/10.1109/DS-RT.2012.14 |
|
2011 |
J. McCandless and D. Gregg |
Optimizing interpreters by tuning opcode orderings on virtual machines for modern architectures Conference on the Principles and Practice of Programming in Java, PPPJ 11, pp161-170. DOI: http://dx.doi.org/10.1145/2093157.2093183 |
|
2010 |
Kevin Williams, Jason McCandless and David Gregg |
Portable Just-in-Time Specialization of Dynamically Typed Scripting Languages 22nd International Workshop, LCPC 2009, 5898, pp391-398. DOI: http://dx.doi.org/10.1007/978-3-642-13374-9_27 |
|
2010 |
Raymond Manley, Paul Magrath and David Gregg |
Code generation for hardware accelerated AES 21st IEEE International Conference on Application-specific Systems Architectures and Processors, pp345-348. DOI: http://dx.doi.org/10.1109/ASAP.2010.5540955 |
|
2010 |
Raymond Manley and David Gregg |
Mapping Streaming Languages to General Purpose Processors through Vectorization 22nd International Workshop, LCPC 2009, 5898, pp95-110. DOI: http://dx.doi.org/10.1007/978-3-642-13374-9_7 |
|
2010 |
Raymond Manley and David Gregg |
A Program Generator for Intel AES-NI Instructions 11th International Conference on Cryptology in India (INDOCRYPT 2010), 6498, pp311-327. DOI: http://dx.doi.org/10.1007/978-3-642-17401-8_22 |