Contents
- 2. About the Data ATG reviews code to find bottlenecks and make perf recommendations 50 titles per
- 3. Why This Talk Is Important The majority of Xbox games are CPU bound The CPU bottleneck
- 4. Format Definition of the problem Examples Recommendation For reference A frame is 17 or 33 ms
- 5. Issue: STL Game using std::list Adding ~20,000 objects every frame Rebuilding the list every frame Time
- 6. std::set and map Many games use set/map as sorted lists Inserts are slow (log(N)) Memory overhead:
- 7. std::vector Hundreds of push_back()s per frame VS7.1 expands vector by 50% Question: How many reallocations for
- 8. Clearly, the STL is Evil
- 9. Use the Right Tool for the Job The STL is powerful, but it’s not free Filling
- 10. The STL is Evil, Sometimes The STL doesn’t solve every problem The STL solves some problems
- 11. Issue: NIH Syndrome Example: Custom binary tree Sorted list of transparent objects Badly unbalanced 1 ms/frame
- 12. Optimizations that Aren’t void appMemcpy( void* d, const void* s, size_t b ) { // lots
- 13. Invent Only What You Need std::set/map more efficient than the custom tree by 10X Tested and
- 14. Profile Run your profiler Rinse. Repeat. Prove the improvement. Don’t rewrite the C runtime or STL
- 15. Issue: Tool Knowledge If you’re a programmer, you use C/C++ every day C++ is complex CRT
- 16. vector::clear Game reused global vector in frame loop clear() called every frame to empty the vector
- 17. Zero-Initialization struct Array { int x[1000]; }; struct Container { Array arr; Container() : arr() {
- 18. Know Thine Holy Standard Use resize(0) to reduce container size without affecting capacity T() means zero-initialize
- 19. Issue: C Runtime void BuildScore( char* s, int n ) { if( n > 0 )
- 20. qsort Sorting is important in games qsort is not an ideal sorting function No type safety
- 21. Clearly, the CRT is Evil
- 22. Understand Your Options itoa() can replace sprintf( s, "%d", n ) *s = '\0' can replace
- 23. Issue: Function Calls 50,000-100,000 calls/frame is normal At 60Hz, Xbox has 12.2M cycles/frame Function call/return averages
- 24. Extreme Function-ality 120,000 functions/frame 140,000 functions/frame 130,000 calls to a single function/frame (ColumnVec ::operator[]) And the
- 25. Beware Elegance Elegance → levels of indirection → more functions → perf impact Use algorithmic solutions
- 26. Inline Judiciously Remember: inline is a suggestion Try “inline any suitable” compiler option 15 to 20
- 27. Issue: for loops // Example 1: Copy indices to push buffer for( DWORD i = 0;
- 28. Watch Out For For Never copy/clear a POD with a for loop std::algorithms are optimized; use
- 29. Issue: Exception Handling Most games never throw Most games never catch Yet, most games enable EH
- 30. Disable Exception Handling Don’t throw or catch exceptions Turn off the C++ EH compiler option For
- 31. Issue: Strings Programmers love strings Love hurts ~7000 calls to stricmp in frame loop 1.5 ms/frame
- 32. Avoid strings String comparisons don’t belong in the frame loop Put strings in a table and
- 33. Issue: Memory Allocation Memory overhead Xbox granularity/overhead is 16/16 bytes Overhead alone is often 1+ MB
- 34. Hidden Allocations push_back(), insert() and friends typically allocate memory String constructors allocate Init-style calls often allocate
- 35. Minimize Per-Frame Allocations Use memory-friendly data structures, e.g. arrays, vectors Reserve memory in advance Use custom
- 36. Other Tidbits Compiler settings: experiment dynamic_cast: just say no Constructors: performance killers Unused static array space:
- 37. Wrap Up Use the Right Tool for the Job The STL is Evil, Sometimes Invent Only
- 38. Call to Action: Evolve! Pass the rubber chicken Share your C++ performance mistakes with your team
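Item 7 asks how many reallocations a run of push_back() calls costs when the vector grows by 50% each time, and item 35 recommends reserving memory in advance. The sketch below illustrates that arithmetic under stated assumptions: `count_reallocations_50pct` models a hypothetical growth rule (capacity starts at zero and grows by half, or by one element when half rounds down to nothing), which is not claimed to match the exact VS7.1 policy, and the element count is left to the caller because the slide's own figure is not shown here. `count_observed_reallocations` measures whatever growth policy the local standard library actually uses, with and without `reserve()`. Both function names are illustrative, not from the talk.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical model of a 50% growth policy (an assumption, not the
// documented VS7.1 behavior): count how many times capacity must grow
// to hold n elements appended one at a time.
std::size_t count_reallocations_50pct(std::size_t n) {
    std::size_t capacity = 0;
    std::size_t reallocations = 0;
    for (std::size_t size = 0; size < n; ++size) {
        if (size == capacity) {
            // Grow by half; fall back to +1 when half rounds down to zero.
            const std::size_t grown = capacity + capacity / 2;
            capacity = (grown > capacity) ? grown : capacity + 1;
            ++reallocations;
        }
    }
    return reallocations;
}

// Count the capacity changes observed on this implementation's
// std::vector while appending n ints, optionally reserving first.
std::size_t count_observed_reallocations(std::size_t n, bool reserve_first) {
    std::vector<int> v;
    if (reserve_first) {
        v.reserve(n);  // one up-front allocation instead of many
    }
    std::size_t reallocations = 0;
    std::size_t last_capacity = v.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(static_cast<int>(i));
        if (v.capacity() != last_capacity) {
            ++reallocations;
            last_capacity = v.capacity();
        }
    }
    return reallocations;
}
```

Under this model, 1,000 appends trigger 18 reallocations, each of which copies every element already stored. Calling `count_observed_reallocations(1000, true)` should report none, since `reserve()` guarantees capacity for at least the requested element count before the loop starts.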