Pyrogenesis  13997
 All Classes Namespaces Files Functions Variables Typedefs Enumerations Enumerator Friends Macros Pages
Profiler2.h
Go to the documentation of this file.
1 /* Copyright (c) 2011 Wildfire Games
2  *
3  * Permission is hereby granted, free of charge, to any person obtaining
4  * a copy of this software and associated documentation files (the
5  * "Software"), to deal in the Software without restriction, including
6  * without limitation the rights to use, copy, modify, merge, publish,
7  * distribute, sublicense, and/or sell copies of the Software, and to
8  * permit persons to whom the Software is furnished to do so, subject to
9  * the following conditions:
10  *
11  * The above copyright notice and this permission notice shall be included
12  * in all copies or substantial portions of the Software.
13  *
14  * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
15  * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
16  * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
17  * IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
18  * CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
19  * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
20  * SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
21  */
22 
23 /**
24  * @file
25  * New profiler (complementing the older CProfileManager)
26  *
27  * The profiler is designed for analysing framerate fluctuations or glitches,
28  * and temporal relationships between threads.
29  * This contrasts with CProfilerManager and most external profiling tools,
30  * which are designed more for measuring average throughput over a number of
31  * frames.
32  *
33  * To view the profiler output, press F11 to enable the HTTP output mode
34  * and then open source/tools/profiler2/profiler2.html in a web browser.
35  *
36  * There is a single global CProfiler2 instance (g_Profiler2),
37  * providing the API used by the rest of the game.
38  * The game can record the entry/exit timings of a region of code
39  * using the PROFILE2 macro, and can record other events using
40  * PROFILE2_EVENT.
41  * Regions and events can be annotated with arbitrary string attributes,
42  * specified with printf-style format strings, using PROFILE2_ATTR
43  * (e.g. PROFILE2_ATTR("frame: %d", m_FrameNum) ).
44  *
45  * This is designed for relatively coarse-grained profiling, or for rare events.
46  * Don't use it for regions that are typically less than ~0.1msecs, or that are
47  * called hundreds of times per frame. (The old CProfilerManager is better
48  * for that.)
49  *
50  * New threads must call g_Profiler2.RegisterCurrentThread before any other
51  * profiler functions.
52  *
53  * The main thread should call g_Profiler2.RecordFrameStart at the start of
54  * each frame.
55  * Other threads should call g_Profiler2.RecordSyncMarker occasionally,
56  * especially if it's been a long time since their last call to the profiler,
57  * or if they've made thousands of calls since the last sync marker.
58  *
59  * The profiler is implemented with thread-local fixed-size ring buffers,
60  * which store a sequence of variable-length items indicating the time
61  * of the event and associated data (pointers to names, attribute strings, etc).
62  * An HTTP server provides access to the data: when requested, it will make
63  * a copy of a thread's buffer, then parse the items and return them in JSON
64  * format. The profiler2.html requests and processes and visualises this data.
65  *
66  * The RecordSyncMarker calls are necessary to correct for time drift and to
67  * let the buffer parser accurately detect the start of an item in the byte stream.
68  *
69  * This design aims to minimise the performance overhead of recording data,
70  * and to simplify the visualisation of the data by doing it externally in an
71  * environment with better UI tools (i.e. HTML) instead of within the game engine.
72  *
73  * The initial setup of g_Profiler2 must happen in the game's main thread.
74  * RegisterCurrentThread and the Record functions may be called from any thread.
75  * The HTTP server runs its own threads, which may call the ConstructJSON functions.
76  */
77 
78 #ifndef INCLUDED_PROFILER2
79 #define INCLUDED_PROFILER2
80 
81 #include "lib/timer.h"
82 #include "ps/ThreadUtil.h"
83 
84 struct mg_context;
85 
86 // Note: Lots of functions are defined inline, to hypothetically
87 // minimise performance overhead.
88 
89 class CProfiler2GPU;
90 
92 {
93  friend class CProfiler2GPU_base;
94 
95 public:
96  // Items stored in the buffers:
97 
98  /// Item type identifiers
99  enum EItem
100  {
101  ITEM_NOP = 0,
102  ITEM_SYNC = 1, // magic value used for parse syncing
103  ITEM_EVENT = 2, // single event
104  ITEM_ENTER = 3, // entering a region
105  ITEM_LEAVE = 4, // leaving a region (must be correctly nested)
106  ITEM_ATTRIBUTE = 5, // arbitrary string associated with current region, or latest event (if the previous item was an event)
107  };
108 
109  static const size_t MAX_ATTRIBUTE_LENGTH = 256; // includes null terminator, which isn't stored
110 
111  /// An arbitrary number to help resyncing with the item stream when parsing.
112  static const u8 RESYNC_MAGIC[8];
113 
114  /**
115  * An item with a relative time and an ID string pointer.
116  */
117  struct SItem_dt_id
118  {
119  float dt; // time relative to last event
120  const char* id;
121  };
122 
123 private:
124  // TODO: what's a good size?
125  // TODO: different threads might want different sizes
126  static const size_t BUFFER_SIZE = 1024*1024;
127 
128  /**
129  * Class instantiated in every registered thread.
130  */
132  {
134  public:
135  ThreadStorage(CProfiler2& profiler, const std::string& name);
136  ~ThreadStorage();
137 
138  void RecordSyncMarker(double t)
139  {
140  // Store the magic string followed by the absolute time
141  // (to correct for drift caused by the precision of relative
142  // times stored in other items)
143  u8 buffer[sizeof(RESYNC_MAGIC) + sizeof(t)];
144  memcpy(buffer, &RESYNC_MAGIC, sizeof(RESYNC_MAGIC));
145  memcpy(buffer + sizeof(RESYNC_MAGIC), &t, sizeof(t));
146  Write(ITEM_SYNC, buffer, ARRAY_SIZE(buffer));
147  m_LastTime = t;
148  }
149 
150  void Record(EItem type, double t, const char* id)
151  {
152  // Store a relative time instead of absolute, so we can use floats
153  // (to save memory) without suffering from precision problems
154  SItem_dt_id item = { (float)(t - m_LastTime), id };
155  Write(type, &item, sizeof(item));
156  m_LastTime = t;
157  }
158 
159  void RecordFrameStart(double t)
160  {
161  RecordSyncMarker(t);
162  Record(ITEM_EVENT, t, "__framestart"); // magic string recognised by the visualiser
163  }
164 
165  void RecordAttribute(const char* fmt, va_list argp) VPRINTF_ARGS(2);
166 
167  void RecordAttributePrintf(const char* fmt, ...) PRINTF_ARGS(2)
168  {
169  va_list argp;
170  va_start(argp, fmt);
171  RecordAttribute(fmt, argp);
172  va_end(argp);
173  }
174 
176  {
177  return m_Profiler;
178  }
179 
180  const std::string& GetName()
181  {
182  return m_Name;
183  }
184 
185  /**
186  * Returns a copy of a subset of the thread's buffer.
187  * Not guaranteed to start on an item boundary.
188  * May be called by any thread.
189  */
190  std::string GetBuffer();
191 
192  private:
193  /**
194  * Store an item into the buffer.
195  */
196  void Write(EItem type, const void* item, u32 itemSize)
197  {
198  // See m_BufferPos0 etc for comments on synchronisation
199 
200  u32 size = 1 + itemSize;
201  u32 start = m_BufferPos0;
202  if (start + size > BUFFER_SIZE)
203  {
204  // The remainder of the buffer is too small - fill the rest
205  // with NOPs then start from offset 0, so we don't have to
206  // bother splitting the real item across the end of the buffer
207 
208  m_BufferPos0 = size;
209  COMPILER_FENCE; // must write m_BufferPos0 before m_Buffer
210 
211  memset(m_Buffer + start, 0, BUFFER_SIZE - start);
212  start = 0;
213  }
214  else
215  {
216  m_BufferPos0 = start + size;
217  COMPILER_FENCE; // must write m_BufferPos0 before m_Buffer
218  }
219 
220  m_Buffer[start] = (u8)type;
221  memcpy(&m_Buffer[start + 1], item, itemSize);
222 
223  COMPILER_FENCE; // must write m_BufferPos1 after m_Buffer
224  m_BufferPos1 = start + size;
225  }
226 
228  std::string m_Name;
229 
230  double m_LastTime; // used for computing relative times
231 
233 
234  // To allow hopefully-safe reading of the buffer from a separate thread,
235  // without any expensive synchronisation in the recording thread,
236  // two copies of the current buffer write position are stored.
237  // BufferPos0 is updated before writing; BufferPos1 is updated after writing.
238  // GetBuffer can read Pos1, copy the buffer, read Pos0, then assume any bytes
239  // outside the range Pos1 <= x < Pos0 are safe to use. (Any in that range might
240  // be half-written and corrupted.) (All ranges are modulo BUFFER_SIZE.)
241  // Outside of Write(), these will always be equal.
242  //
243  // TODO: does this attempt at synchronisation (plus use of COMPILER_FENCE etc)
244  // actually work in practice?
247  };
248 
249 public:
250  CProfiler2();
251  ~CProfiler2();
252 
253  /**
254  * Call in main thread to set up the profiler,
255  * before calling any other profiler functions.
256  */
257  void Initialise();
258 
259  /**
260  * Call in main thread to enable the HTTP server.
261  * (Disabled by default for security and performance
262  * and to avoid annoying a firewall.)
263  */
264  void EnableHTTP();
265 
266  /**
267  * Call in main thread to enable the GPU profiling support,
268  * after OpenGL has been initialised.
269  */
270  void EnableGPU();
271 
272  /**
273  * Call in main thread to shut down the GPU profiling support,
274  * before shutting down OpenGL.
275  */
276  void ShutdownGPU();
277 
278  /**
279  * Call in main thread to shut everything down.
280  * All other profiled threads should have been terminated already.
281  */
282  void Shutdown();
283 
284  /**
285  * Call in any thread to enable the profiler in that thread.
286  * @p name should be unique, and is used by the visualiser to identify
287  * this thread.
288  */
289  void RegisterCurrentThread(const std::string& name);
290 
291  /**
292  * Non-main threads should call this occasionally,
293  * especially if it's been a long time since their last call to the profiler,
294  * or if they've made thousands of calls since the last sync marker.
295  */
297  {
299  }
300 
301  /**
302  * Call in main thread at the start of a frame.
303  */
305  {
308  }
309 
310  void RecordEvent(const char* id)
311  {
313  }
314 
315  void RecordRegionEnter(const char* id)
316  {
318  }
319 
320  void RecordRegionLeave(const char* id)
321  {
323  }
324 
325  void RecordAttribute(const char* fmt, ...) PRINTF_ARGS(2)
326  {
327  va_list argp;
328  va_start(argp, fmt);
329  GetThreadStorage().RecordAttribute(fmt, argp);
330  va_end(argp);
331  }
332 
333  void RecordGPUFrameStart();
334  void RecordGPUFrameEnd();
335  void RecordGPURegionEnter(const char* id);
336  void RecordGPURegionLeave(const char* id);
337 
338  /**
339  * Call in any thread to produce a JSON representation of the general
340  * state of the application.
341  */
342  void ConstructJSONOverview(std::ostream& stream);
343 
344  /**
345  * Call in any thread to produce a JSON representation of the buffer
346  * for a given thread.
347  * Returns NULL on success, or an error string.
348  */
349  const char* ConstructJSONResponse(std::ostream& stream, const std::string& thread);
350 
351  /**
352  * Call in any thread to save a JSONP representation of the buffers
353  * for all threads, to a file named profile2.jsonp in the logs directory.
354  */
355  void SaveToFile();
356 
357  double GetTime()
358  {
359  return timer_Time();
360  }
361 
363  {
364  return m_FrameNumber;
365  }
366 
368  {
369  ++m_FrameNumber;
370  }
371 
372  void AddThreadStorage(ThreadStorage* storage);
373  void RemoveThreadStorage(ThreadStorage* storage);
374 
375 private:
376  void InitialiseGPU();
377 
378  static void TLSDtor(void* data);
379 
381  {
383  ASSERT(storage);
384  return *storage;
385  }
386 
388 
390 
392 
394 
396 
398  std::vector<ThreadStorage*> m_Threads; // thread-safe; protected by m_Mutex
399 };
400 
401 extern CProfiler2 g_Profiler2;
402 
403 /**
404  * Scope-based enter/leave helper.
405  */
407 {
408 public:
409  CProfile2Region(const char* name) : m_Name(name)
410  {
411  g_Profiler2.RecordRegionEnter(m_Name);
412  }
414  {
415  g_Profiler2.RecordRegionLeave(m_Name);
416  }
417 private:
418  const char* m_Name;
419 };
420 
421 /**
422  * Scope-based GPU enter/leave helper.
423  */
425 {
426 public:
427  CProfile2GPURegion(const char* name) : m_Name(name)
428  {
429  g_Profiler2.RecordGPURegionEnter(m_Name);
430  }
432  {
433  g_Profiler2.RecordGPURegionLeave(m_Name);
434  }
435 private:
436  const char* m_Name;
437 };
438 
439 /**
440  * Starts timing from now until the end of the current scope.
441  * @p region is the name to associate with this region (should be
442  * a constant string literal; the pointer must remain valid forever).
443  * Regions may be nested, but preferably shouldn't be nested deeply since
444  * it hurts the visualisation.
445  */
446 #define PROFILE2(region) CProfile2Region profile2__(region)
447 
448 #define PROFILE2_GPU(region) CProfile2GPURegion profile2gpu__(region)
449 
450 /**
451  * Record the named event at the current time.
452  */
453 #define PROFILE2_EVENT(name) g_Profiler2.RecordEvent(name)
454 
455 /**
456  * Associates a string (with printf-style formatting) with the current
457  * region or event.
458  * (If the last profiler call was PROFILE2_EVENT, it associates with that
459  * event; otherwise it associates with the currently-active region.)
460  */
461 #define PROFILE2_ATTR g_Profiler2.RecordAttribute
462 
463 #endif // INCLUDED_PROFILER2
#define u8
Definition: types.h:39
bool m_Initialised
Definition: Profiler2.h:387
const std::string & GetName()
Definition: Profiler2.h:180
#define COMPILER_FENCE
prevent the compiler from reordering loads or stores across this point.
#define VPRINTF_ARGS(fmtpos)
CProfile2Region(const char *name)
Definition: Profiler2.h:409
int m_FrameNumber
Definition: Profiler2.h:389
void AddThreadStorage(ThreadStorage *storage)
Definition: Profiler2.cpp:257
void RecordFrameStart()
Call in main thread at the start of a frame.
Definition: Profiler2.h:304
void RecordAttribute(const char *fmt,...) PRINTF_ARGS(2)
Definition: Profiler2.h:325
ThreadStorage(CProfiler2 &profiler, const std::string &name)
Definition: Profiler2.cpp:269
CMutex m_Mutex
Definition: Profiler2.h:397
CProfiler2 & m_Profiler
Definition: Profiler2.h:227
CProfile2GPURegion(const char *name)
Definition: Profiler2.h:427
#define ASSERT(expr)
same as ENSURE in debug mode, does nothing in release mode.
Definition: debug.h:310
#define PRINTF_ARGS(fmtpos)
A non-recursive mutual exclusion lock.
Definition: ThreadUtil.h:45
void EnableHTTP()
Call in main thread to enable the HTTP server.
Definition: Profiler2.cpp:157
Scope-based enter/leave helper.
Definition: Profiler2.h:406
void RecordAttributePrintf(const char *fmt,...) PRINTF_ARGS(2)
Definition: Profiler2.h:167
#define ARRAY_SIZE(name)
void RecordGPURegionEnter(const char *id)
Definition: Profiler2.cpp:217
#define ENSURE(expr)
ensure the expression &lt;expr&gt; evaluates to non-zero.
Definition: debug.h:282
void IncrementFrameNumber()
Definition: Profiler2.h:367
void RecordSyncMarker(double t)
Definition: Profiler2.h:138
CProfiler2 g_Profiler2
Definition: Profiler2.cpp:35
EItem
Item type identifiers.
Definition: Profiler2.h:99
std::string GetBuffer()
Returns a copy of a subset of the thread&#39;s buffer.
Definition: Profiler2.cpp:281
static const u8 RESYNC_MAGIC[8]
An arbitrary number to help resyncing with the item stream when parsing.
Definition: Profiler2.h:112
void RegisterCurrentThread(const std::string &name)
Call in any thread to enable the profiler in that thread.
Definition: Profiler2.cpp:241
double GetTime()
Definition: Profiler2.h:357
const char * m_Name
Definition: Profiler2.h:436
void SaveToFile()
Call in any thread to save a JSONP representation of the buffers for all threads, to a file named pro...
Definition: Profiler2.cpp:540
void RecordAttribute(const char *fmt, va_list argp) VPRINTF_ARGS(2)
Definition: Profiler2.cpp:305
ThreadStorage & GetThreadStorage()
Definition: Profiler2.h:380
void ShutdownGPU()
Call in main thread to shut down the GPU profiling support, before shutting down OpenGL.
Definition: Profiler2.cpp:181
static const size_t BUFFER_SIZE
Definition: Profiler2.h:126
const char * m_Name
Definition: Profiler2.h:418
void Record(EItem type, double t, const char *id)
Definition: Profiler2.h:150
void Write(EItem type, const void *item, u32 itemSize)
Store an item into the buffer.
Definition: Profiler2.h:196
void RecordGPURegionLeave(const char *id)
Definition: Profiler2.cpp:223
void Initialise()
Call in main thread to set up the profiler, before calling any other profiler functions.
Definition: Profiler2.cpp:141
CProfiler2 & GetProfiler()
Definition: Profiler2.h:175
double timer_Time()
Definition: timer.cpp:98
void * pthread_getspecific(pthread_key_t key)
Definition: wpthread.cpp:184
void RecordRegionEnter(const char *id)
Definition: Profiler2.h:315
void Shutdown()
Call in main thread to shut everything down.
Definition: Profiler2.cpp:186
#define u32
Definition: types.h:41
mg_context * m_MgContext
Definition: Profiler2.h:391
Used by CProfiler2 for GPU profiling support.
Definition: Profiler2GPU.h:31
void RecordGPUFrameStart()
Definition: Profiler2.cpp:205
CProfiler2GPU * m_GPU
Definition: Profiler2.h:395
void EnableGPU()
Call in main thread to enable the GPU profiling support, after OpenGL has been initialised.
Definition: Profiler2.cpp:174
pthread_key_t m_TLS
Definition: Profiler2.h:393
void RecordEvent(const char *id)
Definition: Profiler2.h:310
bool IsMainThread()
Returns whether the current thread is the &#39;main&#39; thread (i.e.
Definition: ThreadUtil.cpp:25
void RemoveThreadStorage(ThreadStorage *storage)
Definition: Profiler2.cpp:263
void RecordRegionLeave(const char *id)
Definition: Profiler2.h:320
static const size_t MAX_ATTRIBUTE_LENGTH
Definition: Profiler2.h:109
void InitialiseGPU()
Definition: Profiler2.cpp:151
void RecordGPUFrameEnd()
Definition: Profiler2.cpp:211
Scope-based GPU enter/leave helper.
Definition: Profiler2.h:424
void ConstructJSONOverview(std::ostream &stream)
Call in any thread to produce a JSON representation of the general state of the application.
Definition: Profiler2.cpp:325
unsigned int pthread_key_t
Definition: wpthread.h:93
void RecordSyncMarker()
Non-main threads should call this occasionally, especially if it&#39;s been a long time since their last ...
Definition: Profiler2.h:296
An item with a relative time and an ID string pointer.
Definition: Profiler2.h:117
std::vector< ThreadStorage * > m_Threads
Definition: Profiler2.h:398
int GetFrameNumber()
Definition: Profiler2.h:362
static void TLSDtor(void *data)
Called by pthreads when a registered thread is destroyed.
Definition: Profiler2.cpp:232
Class instantiated in every registered thread.
Definition: Profiler2.h:131
const char * ConstructJSONResponse(std::ostream &stream, const std::string &thread)
Call in any thread to produce a JSON representation of the buffer for a given thread.
Definition: Profiler2.cpp:502
NONCOPYABLE(ThreadStorage)
void RecordFrameStart(double t)
Definition: Profiler2.h:159