To emit user-defined events into AppFabric's monitoring database, I'm using the WCFUserEventProvider class (available in the WCF 4.0 samples). This class wraps the EventProvider class, which is the underlying class that emits user-defined events to ETW.
During performance/load testing of a WCF service, the app pool process kept using more and more memory, which gradually reduced throughput. Eventually I discovered that I was creating a new instance of the WCFUserEventProvider class (and therefore a new instance of the EventProvider class) every time a user-defined event was logged.
Once I changed my implementation to use a single static instance of EventProvider, the memory leak disappeared and throughput remained constant.
It's the construction of the EventProvider class that is the costly part, not the actual submission of the event.
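As a rough illustration of the fix, here is a minimal sketch of keeping one shared EventProvider for the lifetime of the process. The UserEvents class name, the Write method, and the provider GUID are placeholders I made up for this example, not part of the WCF samples. The real WCFUserEventProvider exposes its own methods and uses its own provider ID, so treat this as the shape of the change rather than the exact code.

```csharp
using System;
using System.Diagnostics.Eventing;

// Hypothetical helper: holds one EventProvider for the whole app domain
// instead of constructing a new one for every logged event.
public static class UserEvents
{
    // Placeholder GUID for illustration only; substitute the provider ID
    // that your monitoring setup actually listens to.
    private static readonly Guid ProviderId =
        new Guid("11111111-2222-3333-4444-555555555555");

    // Constructed once, on first use of the class. Each EventProvider
    // registers itself with ETW when it is constructed, so this is the
    // expensive step we want to avoid repeating.
    private static readonly EventProvider Provider =
        new EventProvider(ProviderId);

    // Writes a simple string event through the shared provider.
    // Returns false if the event could not be written.
    public static bool Write(string message)
    {
        return Provider.WriteMessageEvent(message);
    }
}
```

The service code then calls UserEvents.Write("...") wherever it used to create a provider and log through it. EventProvider implements IDisposable, so if you really do need short-lived instances, disposing each one when you are done with it should also help, although I have only tested the static approach.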