To put this in context, I did a quick measurement of opt compilation time on ia32-linux. This feature would try to reduce the compilation time spent in OPT_ExpandRuntimeServices, at the cost of some fairly significant engineering to cache and patch in opt compiler IR. We've considered this several times in the past, but the potential reduction in compilation time has never been enough to motivate the work. Things change as the system evolves, but I wanted to note that this has come up before and was not deemed worth the engineering it would take.
With the current system forcing compilation of every method in _213_javac, we get:
at O0, Expand Runtime Services is 2.7% of compilation time (we don't inline allocations at O0);
at O1, Expand Runtime Services is 19.4% of compilation time;
at O2, Expand Runtime Services is 18.7% of compilation time (currently O1 is almost the same as O2).
Using the default (O1) optimization level, the command:
../rvmRoot/dist/production_ia32-linux/rvm -X:vm:measureCompilation=true -X:aos:enable_recompilation=false -X:aos:initial_compiler=opt SpecApplication -s100 -m1 -M1 -a _213_javac
yields:
Compilation Subsystem Report
Comp   #Meths     Time   bcb/ms  mcb/bcb   MCKB  BCKB
JNI        19     0.39       NA       NA    4.5    NA
Base      115     7.09  1083.66     7.64   79.1  10.4
Opt       684  8015.46    11.91     7.76  596.2  76.8
Baseline Compiler SubSystem
Phase                             Time (ms)  (%ofTotal)
Compute GC Maps                        1.78       28.28
OSR setup                              0.02        0.24
Code generation                        3.78       59.90
Encode GC/MC maps                      0.73       11.57
TOTAL                                  6.31
Optimizing Compiler SubSystem
Phase                             Time (ms)  (%ofTotal)
Convert Bytecodes to HIR
  Generate HIR                          447       5.61%
  AdjustBytecodeIndexes                   0       0.00%
  OSR_OsrPointConstructor                82       1.04%
  Branch Optimizations                   34       0.43%
  Adjust Branch Probabilities             5       0.06%
  TOTAL                                 570       7.14%
CFG Transformations
  Tail Recursion Elimination              2       0.04%
  Basic Block Frequency Estima
    Build LST                            46       0.58%
    Estimate Block Frequenci             23       0.30%  25.57% Infrequent BBs
    TOTAL                                70       0.88%
  Static Splitting                       12       0.16%
  Branch Optimizations                   42       0.54%
  TOTAL                                 128       1.61%
CFG Structural Analysis
  Build LST                              25       0.33%
  Yield Point Insertion                   2       0.04%
  Estimate Block Frequencies             15       0.19%  25.21% Infrequent BBs
  TOTAL                                  44       0.55%
Simple Opts                              91       1.14%
Escape Transformations                   56       0.71%
Branch Optimizations                     29       0.37%
Local CopyProp                           12       0.15%  25.18% Infrequent BBs
Local ConstantProp                        7       0.09%  25.18% Infrequent BBs
Local CSE                                39       0.50%  25.18% Infrequent BBs
Field Analysis                            3       0.04%
Convert HIR to LIR
  Expand Runtime Services              1552      19.44%  13.37% Infrequent RS calls
  Branch Optimizations                   88       1.11%
  Local Cast Optimizations                9       0.12%  10.27% Infrequent BBs
  HIR Operator Expansion                 65       0.82%
  Branch Optimizations                  136       1.70%
  Adjust Branch Probabilities            10       0.13%
  TOTAL                                1863      23.33%
Local CopyProp                           32       0.41%  8.84% Infrequent BBs
Local ConstantProp                       65       0.82%  8.84% Infrequent BBs
Local CSE                                43       0.54%  8.87% Infrequent BBs
Simple Opts                             203       2.55%
Basic Block Frequency Estimation
  Build LST                              73       0.92%
  Estimate Block Frequencies             45       0.57%  30.13% Infrequent BBs
  TOTAL                                 118       1.49%
Code Reordering                         183       2.30%
Branch Optimizations                    121       1.52%
Convert LIR to MIR
  SplitBasicBlock                         4       0.06%
  Instruction Selection
    Reduce Operators                      8       0.11%
    ConvertALUOps                       153       1.93%
    Normalize Constants                  37       0.47%
    Live Handlers                         0       0.00%
    DepGraph & BURS                    1065      13.35%  30.77% Infrequent BBs
    Complex Operators                    47       0.59%
    NullCheckCombining                   29       0.37%
    TOTAL                              1342      16.82%
  TOTAL                                1347      16.88%
Register Mapping
  MIR Range Splitting                    21       0.27%
  Expand Calling Convention             109       1.37%
  Expand Calling Convention               0       0.00%
  Live Analysis                         352       4.42%
  Register Allocation
    Register Allocation Prep             90       1.13%
    Linear Scan Composite Ph
      Interval Analysis                 205       2.57%
      Register Restriction              171       2.15%
      Linear Scan                       969      12.14%
      Update GCMaps 1                     8       0.11%
      Spill Code                        468       5.87%
      Update GCMaps 2                    50       0.63%
      Update OSRMaps                      2       0.04%
      TOTAL                            1877      23.51%
    TOTAL                              1967      24.63%
  Insert Prologue/Epilogue               82       1.03%
  TOTAL                                2533      31.73%
Branch Optimizations                     61       0.77%
Generate Machine Code
  Final MIR Expansion                    38       0.49%
  Assembler Driver                      388       4.87%
  TOTAL                                 427       5.35%
TOTAL COMPILATION TIME                 7985
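As a rough sanity check on what the feature could buy us, the arithmetic can be sketched in a few lines of Python. The two times are copied from the O1 report above; the helper names and the 50% case are assumptions made purely for illustration, since nothing here measures how much of Expand Runtime Services a cache-and-patch scheme would actually remove.

```python
# Times in ms, copied from the O1 report above.
TOTAL_COMPILATION_MS = 7985
EXPAND_RUNTIME_SERVICES_MS = 1552


def phase_share(phase_ms: float, total_ms: float) -> float:
    """Fraction of total compilation time spent in one phase."""
    return phase_ms / total_ms


def speedup_bound(phase_ms: float, total_ms: float,
                  fraction_removed: float = 1.0) -> float:
    """Overall speedup if `fraction_removed` of the phase's time disappeared
    and every other phase stayed the same (an Amdahl-style estimate)."""
    remaining_ms = total_ms - phase_ms * fraction_removed
    return total_ms / remaining_ms


ers_share = phase_share(EXPAND_RUNTIME_SERVICES_MS, TOTAL_COMPILATION_MS)
# Upper bound: the phase costs nothing at all.
best_case = speedup_bound(EXPAND_RUNTIME_SERVICES_MS, TOTAL_COMPILATION_MS)
# Hypothetical: caching removes half of the phase (an assumed figure).
half_case = speedup_bound(EXPAND_RUNTIME_SERVICES_MS, TOTAL_COMPILATION_MS,
                          fraction_removed=0.5)

print(f"Expand Runtime Services share: {ers_share:.2%}")
print(f"Speedup if the phase cost nothing: {best_case:.2f}x")
print(f"Speedup if half of it went away (assumed): {half_case:.2f}x")
```

Even in the unrealistic best case where the phase becomes free, opt compilation at O1 gets only about 1.24x faster, which is the ceiling any caching scheme would be working under.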