Overview
The ACS Performance framework was created primarily to determine the performance limitations of various ACS APIs. This ranges from very low-level tests, such as measuring how many method invocations can be made on a component per second, to more abstract tests, such as measuring how long it takes to start the core of ACS. The framework has been designed so that it is useful not only to ACS but to other ALMA software subsystems as well. It consists of:
- Profiler objects in C++, Java, and Python. These profilers are essentially stopwatches that collect timing data about a particular block of code.
- A bash script named acsutilProfiler, which profiles entire executables.
- A report generator, which turns the raw output from the profilers into human-readable reports.
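Since the Profiler objects are essentially stopwatches, their use can be illustrated with a short sketch. Note that the class and method names below are hypothetical and only demonstrate the start/stop/report idea; the actual ACS Profiler API is documented elsewhere.

```python
import time

class StopwatchProfiler:
    """Stopwatch-style profiler sketch. The real ACS Profiler API may
    differ; this class only illustrates the start/stop/report idea."""

    def __init__(self):
        self._durations = []
        self._start = None

    def start(self):
        # Begin timing a block of code.
        self._start = time.perf_counter()

    def stop(self):
        # End timing and record the elapsed wall-clock time in seconds.
        self._durations.append(time.perf_counter() - self._start)
        self._start = None

    def report(self):
        # Summarize the recorded timings.
        n = len(self._durations)
        total = sum(self._durations)
        return {
            "count": n,
            "mean_s": total / n if n else 0.0,
            "min_s": min(self._durations, default=0.0),
            "max_s": max(self._durations, default=0.0),
        }

profiler = StopwatchProfiler()
for _ in range(5):
    profiler.start()
    sum(range(10_000))  # the block of code being profiled
    profiler.stop()

print(profiler.report()["count"])  # prints 5
```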
Interesting ACS 4.1.0 Performance Facts
On a single host:
- The amount of time it takes to compile Java files has decreased
by nearly one-half since ACS 4.0
- The acsStartORBSRVC
command executes nearly 10% slower under ACS 4.1
- Invocation times of Java methods fluctuate much more under 4.1 than under ACS 4.0.
- C++ logging performance is by and large at the same level as in ACS 4.0. This does not hold for large logs (bigger than 1 KB), however, where the time it takes to send a log can be roughly ten times what it used to be. Java logging has taken a small performance hit as well, while Python logging has improved overall.
- Invocation times for sending Python events fluctuate considerably more than under 4.0, but overall the numbers remain about the same. There is little change in the Java and C++ APIs.
- For ACS exceptions of significant depth (200 or more), performance has improved significantly, generally by a factor of ~2.
- There seems to be a slight improvement with the Bulk Data API in most cases.
On a 1-Gigabit Ethernet connection using remote containers:
- The data retrieved from the remote tests is essentially in agreement with the local tests. The only difference is that performance losses/gains are not as noticeable because of the added overhead of network communications.
- Bulk Data performance tests hang. To be investigated.
Please see the links at the bottom of this page for the complete reports!
High-level Guide to Using the ACS Benchmarking Suite in Your Own Code
To test the performance of your own code, the following simple steps
should be performed:
- Select a block of code to be analyzed.
- Write a client which invokes the chosen method n times, where n is a reasonably large number, to obtain accurate measurements.
- Incorporate Profiler objects into your client.
- Start up ACS and any applicable containers.
- Run the client and save its output to a file.
- Import the file containing the client's output into a database
using tools provided by ACS.
- Generate HTML reports using the performance database.
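The client side of the steps above can be sketched as a minimal timing loop. This is an illustration under stated assumptions: component_method is a hypothetical stand-in for the real component call, and the one-duration-per-line output format is a placeholder, not the format the ACS import tools actually expect.

```python
import time

def component_method():
    # Hypothetical stand-in for the component method under test.
    return sum(range(1000))

N = 1000  # large enough to obtain reasonably stable statistics
durations = []
for _ in range(N):
    t0 = time.perf_counter()
    component_method()
    durations.append(time.perf_counter() - t0)

# Save the client's raw output to a file for later import
# into the performance database (placeholder format).
with open("client_output.txt", "w") as f:
    for d in durations:
        f.write(f"{d:.9f}\n")

print(f"{N} invocations, mean {sum(durations) / N * 1e6:.1f} us")
```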
EDITOR'S NOTE:
At the moment, there is no official ACS documentation on incorporating Profiler objects into your own code or on using the acsutilProfiler script to profile entire executables. This document will most likely be created for ACS 5.0, after major enhancements are made to the benchmarking suite. For the time being, please see ACS/Benchmark/components/test/genCompLoggingReport for a precise example of the steps described above and ACS/Benchmark/components/test/genStartupTest for usage of acsutilProfiler. Additionally, there is a PowerPoint presentation located at ACS/Documents/ACS-Course/ACSCourse-Performance.ppt
Old Performance Facts
Interesting ACS 4.0.1 Performance Facts
On a 1-Gigabit Ethernet connection using remote containers:
- It takes half a millisecond to invoke a C++ or Python
component's method and 2.5ms to invoke a Java
component's method.
- It takes 46ms to invoke a C++ component's method
which returns 500KBytes of data from C++ or Python clients. From Java,
it takes 87ms.
- Python and Java perform nearly an order of magnitude slower than C++ with respect to logging.
- Java clients consistently catch CORBA exceptions faster than
C++/Python (by roughly 15%).
- Using the Bulk Data API, data rates of 16MB/second are achieved
for large data sets. CPU speed played a large role in this.
- The Event Channel API has no problems handling one-thousand
events per second of size 1KB in a very simple scenario.
- In general, C++ and Python are neck and neck with respect to performance, while Java is fine for everything except large data sets.
Modified on Thursday, 02-Dec-2004 13:19:50 MDT