Using Insight Operations
Note: Spring Insight templates are no longer supported. The documentation is here for reference only.
Insight Operations enables you to see at a glance how your applications and servers are doing. You can immediately see the overall health of the cluster and, with a few clicks, view the performance of individual applications, end points, or servers.
- Invoke Spring Insight in your Browser
- Browse all Applications and Endpoints
- View Recent Activity of Your Application
- Filter Trace Details
- Import and Export Traces
- Customize Endpoint Thresholds
- Enable and Disable Collection Strategies
- View Spring Insight Data In Google Speed Tracer
Invoke Spring Insight Operations in your browser.
Use your browser to browse to the Spring Insight dashboard using the following URL:
http://dashboardServer:port/insight
Replace dashboardServer with the name or IP address of the host running the Dashboard and port with the HTTP listen port, 8080 by default. If you have set up SSL, you can also use https://dashboardServer:SSLport/insight. Replace SSLport with the configured SSL port number, 8443 by default.
The browser invokes the main Spring Insight dashboard on the Browse Resources screen:
Spring Insight traces all application activity and displays it on the dashboard. For example, if you performed any JDBC queries, each one is shown in the Spring Insight dashboard along with a time line of recent requests.
The Browse Resources screen shows information for all types of applications, but it is especially effective (and shows the most detail) for Spring, Grails, and Roo applications.
From the main Spring Insight dashboard, click the Browse Resources link:
On the left, the root tree node, All Applications, is selected in the APPLICATIONS panel and All Instances is selected in the INSTANCES panel. The APPLICATIONS HEALTH TREND panel displays a graph of the recent health of all applications on all servers. The APPLICATIONS panel on the right, beneath the graph, displays a list of the applications with some statistics for the application for the time period displayed by the graph:
- The Health Trend column shows a simple sparkline that graphically describes the recent health of the application.
- The Throughput column shows how many Traces per Minute (tpm) were executed over the current time window.
- The Errors column shows what percentage of Traces resulted in an Error (HTTP status 500 to 600).
The graphical or tabular view of all applications is useful to see which applications have been busiest and to compare their relative health. Click in the graphs to see the application-specific information. Click the column headers in the table to sort the information based on the values in the column.
The INSTANCES panel at the bottom of the right column displays the same statistics for each server in the cluster. From this list, you can easily identify underperforming or failing servers.
With All Applications selected in the left APPLICATIONS panel, click on an individual server in the left INSTANCES panel. This view displays a Health Trend graph for all applications running on the selected server and detailed information for the server.
The heading of the graph panel changes to identify the server you selected. The Vitals section of the graph panel displays performance statistics for the server for the time period depicted on the graph:
- Throughput: How many Traces per Minute (tpm) were executed during the current time window.
- Invocations: Total number of requests made during the current time window.
- Error rate: Percentage of invocations that resulted in an HTTP error status.
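The three Vitals figures can be illustrated with a small calculation. This is a minimal sketch, not Insight's implementation: the Trace class, its status field, and the vitals function are hypothetical names, and it assumes that an error is any trace whose HTTP status falls in the 500 to 600 range used by the Errors column.

```python
from dataclasses import dataclass


@dataclass
class Trace:
    status: int  # HTTP status code of the response (hypothetical field)


def vitals(traces, window_minutes):
    """Return (throughput_tpm, invocations, error_rate_percent) for one time window."""
    invocations = len(traces)
    # Throughput: traces per minute over the window.
    throughput = invocations / window_minutes if window_minutes > 0 else 0.0
    # Assumption: a status from 500 to 600 counts as an error.
    errors = sum(1 for t in traces if 500 <= t.status <= 600)
    error_rate = 100.0 * errors / invocations if invocations else 0.0
    return throughput, invocations, error_rate


# 20 hypothetical traces captured over a 10-minute window, 2 of them errors.
sample = [Trace(200)] * 18 + [Trace(500)] * 2
print(vitals(sample, window_minutes=10))
```

With this sample data the window holds 20 invocations, which works out to 2 traces per minute and an error rate of 10 percent.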
The Properties section of the graph panel lists system property settings for the server:
- catalina.base: Home directory for the tc Runtime instance.
- date.creation: Date the tc Runtime instance was created.
- java.pid: Process ID of the Java VM.
- system.net.connecting.ip:
- system.net.default.route.ip:
- system.net.ip:
- system.net.name: Network name of the host computer.
Expand the All Properties list at the bottom to see a complete list of tc Server and Java system properties.
Click on an application in the left APPLICATIONS panel and All Instances in the left INSTANCES panel. The right panel displays information about the selected application and its End Points across all servers in the cluster. The END POINTS panel displays the End Points associated with the application; the same list appears below the name of the application in the left APPLICATIONS panel.
The Vitals section in the graph shows a summary of the health of the application.
Each row in the End Points table represents an end point, which is a receptor for requests. The universe of possible HTTP URLs is unlimited. However, Spring Insight can group requests together based on the controller with which the requests are associated. For each End Point, Spring Insight displays the following information:
- Health: Shows how well the response time metric is kept within a tolerable threshold, where red is less healthy and green is more healthy. See Customizing End Point Thresholds for setting the tolerable threshold.
- End Point: Displays the name of the End Point.
- Throughput Trend: Displays a simple sparkline that shows the recent mean throughput time of the End Point.
- Throughput: Shows how many Traces per Minute (tpm) were executed over the current time window.
- Errors: Shows what percentage of Traces resulted in an Error (HTTP status 500 to 600).
- Response Time: Displays the 99th-percentile response time over the given time range. This value is most useful to determine the worst-case request. A value of 115 ms indicates that 99% of the requests completed within 115 milliseconds. The response time of an HTTP request is determined by the full time it takes the container to ship the response to the client, and not just the time spent in a controller.
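To make the 99% response time concrete, the sketch below computes a percentile with the nearest-rank method. The documentation does not state which percentile definition Insight uses, so the method, the function name, and the sample data are all assumptions chosen for illustration.

```python
import math


def percentile_nearest_rank(values, pct):
    """Smallest value such that at least pct percent of the values are <= it."""
    if not values:
        raise ValueError("no response times to summarize")
    ordered = sorted(values)
    # 1-based nearest rank; multiply before dividing to keep the arithmetic exact.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


# 200 hypothetical response times: 1 ms, 2 ms, ..., 200 ms.
response_ms = list(range(1, 201))
print(percentile_nearest_rank(response_ms, 99))
```

Under this definition, the returned value is the response time within which 99% of the sampled requests completed, and the slowest 1% lie above it.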
Click on any individual server in the left INSTANCES panel to see the same information described in the previous step for only the selected server.
Click on a particular End Point, either in the left APPLICATIONS panel or in the right END POINTS table for a particular application:
Spring Insight displays the following detailed information about the End Point:
- The chart shows the throughput, response time, and error rate trends on the same graph. Response time trend refers to the mean response time of the End Point over the time range. Throughput trend refers to the recent mean throughput time of the End Point. Error rate is the percentage of traces resulting in an HTTP error. Click on markers (points) in the chart to view Trace data that occurred during that time range.
- The RESPONSE TIME HISTOGRAM is an interactive graph that shows how many invocations occurred within a given time period. The Y-axis represents the response time of an invocation. The X-axis represents the number of invocations. Using the histogram is an easy way to identify outliers in your data. The longest-running invocations are always at the top of the histogram. If extreme outliers exist, they are indicated by red bars.
- The Health of the End Point is determined by the Response Time for requests made over the given time interval. The response times are broken down into various Health Zones such as frustrated, tolerated, or satisfied. Click on a particular Health Zone to see representative traces within that zone.
Click a bar in the histogram or on one of the markers in the Throughput or Response Time Trend graphs. A REPRESENTATIVE TRACES panel shows representative traces for some invocations that occurred during the selected duration.
The REPRESENTATIVE TRACES panel includes data similar to that of the Recent Activity screen. From here it is easy to drill into the slowest or fastest running traces to see what made them different. See Viewing Recent Activity of Your Application for detailed information about traces and trace details.
You can get an overview of recent activity for a particular application and for all applications currently deployed. You can also drill down to the details of a particular application event.
From the main Spring Insight dashboard, select the Recent Activity tab:
The Recent Activity screen displays traces from your application. A trace is a breakdown of the activity of a request. The Recent Activity screen answers the question What just happened?
In the application selector drop-down box in the top-right of the Trace History panel, select the name of your application.
Use the application selector to view activity for all applications or for just a single one.
The Trace History panel shows a timeline of recent activity for your application as real-time requests, represented by bars in the chart. Each bar represents all requests that occurred within a time slice, as measured by the chart. The full time range has 60 time slices. The height of the bar is equal to the longest request that occurred during that window.
The Trace History graph shows activity that Spring Insight has monitored over the past N minutes. When a trace is captured by Spring Insight, it shows on the graph as a bar on the right-hand side. As time passes, bars move left until they fall out of the time range. Click on bars to drill deeper.
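The way traces become bars can be sketched as a bucketing step: split the time range into 60 slices and keep the longest request in each slice. This is an illustration under assumptions, not Insight's code. The function name, the (start time, duration) tuple layout, and the use of seconds and milliseconds are hypothetical.

```python
def trace_history_bars(traces, range_start, range_seconds, slices=60):
    """Bucket (start_time_s, duration_ms) pairs; each bar is the longest duration in its slice."""
    slice_len = range_seconds / slices
    bars = [0] * slices
    for start, duration in traces:
        offset = start - range_start
        if 0 <= offset < range_seconds:  # ignore traces outside the visible range
            # Clamp to the last slice to guard against floating-point rounding.
            index = min(int(offset // slice_len), slices - 1)
            bars[index] = max(bars[index], duration)
    return bars


# A hypothetical 10-minute range: each of the 60 slices covers 10 seconds.
bars = trace_history_bars(
    [(5, 120), (7, 300), (595, 40), (600, 999)],
    range_start=0,
    range_seconds=600,
)
print(bars[0], bars[59])
```

The two traces that start at 5 s and 7 s share the first slice, so that bar takes the longer duration of 300 ms. The trace that starts at 600 s falls outside the range and is not drawn.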
Click on a bar in the Trace History chart.
The Traces panel displays a list of traces that executed during the window of time (or time slice), and then details about a specific trace. Insight automatically shows details about the first trace in the Trace Details panel. If you click on a different trace, Insight refreshes the details panel with corresponding information.
You can sort the traces in the Traces panel by Duration (how long did the trace take?), Start (when did the trace start?), End Point (what was the request?), or Error (did the request result in an error?)
Click on a trace in the Traces panel to view its details in the Trace Detail panel.
The Trace Detail panel contains a breakdown of a trace’s activity in a tree format. A trace consists of a top-level operation, usually a Web request, and all nested operations.
Spring Insight uses “smart collapse” to determine how to collapse the tree of trace details so that you do not get pages and pages of trace information. You can, of course, expand operations to drill down into their details.
Operations are the fundamental building blocks of traces. An operation can represent a Web request, a transaction, a call to an MVC controller, a file opening, a service request, and so on. Each operation may have other operations nested within it. The nesting structure shows the normal stack-based method invocation pattern.
The operation timing graph shows two pieces of information:
- When the operation executed in relation to the other operations. This information shows whether the operation executed towards the end of the request or the beginning by the location of the green bar in relation to its borders.
- How long the operation took to execute, indicated by the width of the green bar. Because operations can be nested within operations, the green bar shows only how long the particular operation took, not the sum of the durations of the nested operations. This way you can scan all the nested operations and find the particular one that took the longest time, based on the width of its green bar.
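The distinction between an operation's own time and the time of its nested operations can be sketched as a tree calculation. This is a minimal sketch that assumes own time equals the operation's total duration minus the durations of its direct children. The Operation class, its fields, and the sample labels are hypothetical and not part of Insight.

```python
from dataclasses import dataclass, field


@dataclass
class Operation:
    label: str
    duration_ms: float  # total time, including nested operations
    children: list = field(default_factory=list)


def self_time_ms(op):
    """Time spent in this operation itself, excluding its nested operations."""
    return op.duration_ms - sum(child.duration_ms for child in op.children)


def widest_bar(op):
    """Return the operation in the tree with the largest self time."""
    best = op
    for child in op.children:
        candidate = widest_bar(child)
        if self_time_ms(candidate) > self_time_ms(best):
            best = candidate
    return best


trace = Operation("GET /hotels", 100, [
    Operation("controller", 90, [
        Operation("JDBC query", 60),
        Operation("render view", 20),
    ]),
])
print(widest_bar(trace).label)
```

In this sample tree the request and the controller each account for only 10 ms of their own, so the nested JDBC query, at 60 ms, would show the widest bar.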
Click on an operation’s label, or the entire row, to drill further into the operation details. The details of this panel are specific to each type of operation, and contain the finest granularity of collected data.
Click on Filter to filter the trace details shown in the panel. See Filtering Trace Details for additional details about this feature.
Click on Related To to navigate to the Trace’s corresponding End Point or application in the Browse Resources tab. This button is useful when you want to see how similar requests (requests to the same End Point or application) have performed over time.
- A lightbulb icon gives a visual explanation of how each trace is classified as belonging to a specific endpoint. For example, in the screenshot below, the frame representing the HotelsController search method is selected as the most important frame in the trace.
- A bar above the trace detail pane displays details of an HTTP request that resulted in an error.
The buttons above the application selector drop-down box help you control the information you see in the Spring Insight dashboard and perform additional administrative tasks:
- Change the global time range using the first drop-down list. The time range specifies how many minutes worth of data shows up in the Trace History graph.
- Play or pause the graph movement by specifying pause in the first drop-down list. If the graph is in Play mode, the word Live appears under the right-hand side of the graph. If you have paused the graph movement, the time when you clicked the pause button appears instead.
- Click the << and >> buttons to rewind or fast-forward, based on the specified time ranges. Spring Insight persists all trace information about all your applications to disk, which means you can rewind and look at trace information from when you first began to track the performance of your application.
- In the right-most drop-down list, choose Refresh to refresh the trace history by reloading all data within the Trace History graph.
Depending on the nature of a specific trace, the list of operations in the corresponding Trace Detail pane might be very long. Spring Insight uses smart collapse, which means it collapses those operations it thinks are unimportant but expands operations which are most likely interesting to you. However, the Trace Details pane could still be very large. In this case, you might want to filter the list of operations so as to display only certain types that you are interested in.
The list of available filters depends on the plug-ins currently installed in Spring Insight as well as the type of operations in the current list of trace details. Spring Insight has a number of plug-ins installed by default. Click the Administration tab on the main dashboard then Collection Plug-Ins to see the list. If you previously added your own custom plug-in to Spring Insight so as to display custom trace details, then this filter might also be available.
The default list of filters is as follows:
- Database: Filters operations based on whether they are related to general database calls. This could include transactions as well as standard calls to a relational database, for example.
- JDBC: Filters operations based on whether they are JDBC calls. The results of this filter are a subset of the results of the Database filter.
- Web: Filters operations based on whether they are related to Web calls, such as HTTP requests and responses, rendering of HTML pages, and so on.
Filters are tri-state—you can set each filter to one of three different states:
- Unselected: operations are included in the results unless one or more other filters are set to + (plus).
- + (plus sign): only these operations are included in the results.
- - (minus sign): only these operations are excluded from the results.
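The three filter states can be sketched as a small selection function. This is an illustration of the rules listed above, not Insight's implementation: the function name, the data layout, and the sample operations are hypothetical. It also assumes that a minus filter wins when an operation matches both a plus and a minus filter, which the documentation does not specify.

```python
def apply_filters(operations, states):
    """operations: list of (label, set_of_categories); states: {category: '+' or '-'}.

    A category missing from `states` is treated as unselected.
    """
    plus = {c for c, s in states.items() if s == "+"}
    minus = {c for c, s in states.items() if s == "-"}
    kept = []
    for label, categories in operations:
        if categories & minus:
            continue  # excluded by a minus filter (assumed to take precedence)
        if plus and not (categories & plus):
            continue  # at least one plus filter is set and this operation matches none
        kept.append(label)
    return kept


operations = [
    ("HTTP request", {"Web"}),
    ("SQL select", {"Database", "JDBC"}),
    ("transaction commit", {"Database"}),
    ("controller call", set()),
]
print(apply_filters(operations, {"JDBC": "+"}))
print(apply_filters(operations, {"Web": "-"}))
```

Setting JDBC to plus keeps only the SQL select, while setting Web to minus keeps everything except the HTTP request.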
Filters are sticky, which means that they stay in place even when you navigate away from the page in which you set the filter. This is useful if you want to look at multiple applications or End Points, searching for JDBC calls, because you do not have to reset the filter each time you look at a different trace. You can remove the filter at any time, as described in the procedure below.
To apply filters, follow these steps:
- Display the Trace Details pane for a particular trace you are interested in. See Viewing Recent Activity of Your Application.
In the top-right corner of the Trace Details pane, click the Filter drop-down list and select the filters you want to apply. You can select multiple filters; the results are a sum of the individual filter results. Set the filter to + (plus) to include only those operations, - (minus) to exclude the operations, or leave the box unselected. If you do not set a filter to + or -, the operations are included unless another filter is set to +.
The preceding graphic shows how to apply a JDBC filter.
Click to the left of the Filter drop-down list to actually apply the filter. A message appears below the Trace Detail label to alert you that a filter is being applied.
After applying the filter, the trace details navigation tree should get smaller, and the only details that are now included are those related to the specified filter, such as JDBC calls in our example.
Because of the stickiness of the filter, as you navigate away from your current page you’ll see that Spring Insight continues to apply the filter.
To remove the filter, click on the X to the left of the Filter applied to trace alert.
With this feature, you can export a trace to a file and then import it into either the same or different instance of Spring Insight. This feature is useful if the Insight user who captured the trace in real time wants to look at it at a later time, or hand it off to another person.
For example, assume that a QA engineer is testing a Web application and gets an error, or an operation takes too long to complete. The QA engineer uses Spring Insight to take a look at what just happened by navigating to the relevant trace and viewing its details. If the QA engineer decides to open a bug about the problem, they can export the trace to a file and then attach this file to the bug issue so that the developer can take a look at a later date. The developer does not need to actually deploy the application; rather, they can simply look at the trace details to figure out which operation caused the error, where the excessive time occurred, and so on. This is an easy way to capture a complete set of trace information about an error event, even if the event is not reliably reproducible.
You can only import traces that were previously exported from a Spring Insight instance.
To export a trace:
- Display the Trace Details pane for the particular trace you want to export. See View Recent Activity of Your Application.
- From the right-most drop-down list in the Trace Details pane, select Export Trace:
- Save the exported trace to a file. The name of the file will be of the form
To import a trace:
- From the main Spring Insight dashboard, click the Administration tab.
- In the left pane, click Import Trace.
- In the right pane, use the browse button to browse the trace you want to import; the name of the file will be of the form
Spring Insight immediately takes you to the Recent Activity page with the trace and its details displayed.
To locate the trace in the Trace History pane, go back in time to when the trace was originally exported.
The health of an end point is based on how many traces took longer to execute than the response time threshold.
By default, Spring Insight uses a response time threshold of 200ms. In the response time histogram, the upper limit of the y-axis is 4-times the threshold, or 800 ms by default. The time chunks of the y-axis are not evenly distributed, but rather, broken up in a way to show the distribution of the response times of the recent End Point traces.
If, for a given trace or subset of traces, you find that the default threshold is too high or too low, you can change it. For example, if you find that the response times for the .*show.* methods are almost always below 100 ms, you might want to set this as the threshold. The histogram will then have a smaller range, and thus show more fine-grained information. Similarly, if you have an end point in which the response times are always over 200 ms, the health of this end point will almost always show as frustrated. If you decide that a threshold of 300 ms is acceptable, then you can change it for this End Point so it will show as appropriately healthy.
In other words, if you change the response time threshold for an End Point, you change the criteria that Spring Insight uses to decide whether it is healthy or not.
The following graphic shows the health and response time histogram for an End Point whose threshold is the default (200ms):
Note the upper limit of 800 ms in the histogram, and that the satisfied range is under 200 ms. A response time of over 200 ms but under 800 ms is tolerated, but over 800 ms is frustrated, or unhealthy.
Note that all response times are under the default threshold so the End Point is healthy. If the response times are all significantly below the default threshold, it might be a good candidate to lower the threshold to get more fine-grained response time information.
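The relationship between the threshold and the Health Zones can be written as a short function. This is a sketch under assumptions: the function name is hypothetical, and the treatment of response times that fall exactly on a boundary (exactly the threshold, or exactly four times the threshold) is a guess, since the documentation only describes values under and over those limits.

```python
def health_zone(response_ms, threshold_ms=200.0):
    """Classify a response time against a threshold T: satisfied, tolerated, or frustrated."""
    if response_ms < threshold_ms:
        return "satisfied"   # under T
    if response_ms <= 4 * threshold_ms:
        return "tolerated"   # between T and 4T (boundary handling is an assumption)
    return "frustrated"      # over 4T


# Default threshold of 200 ms, so the histogram's upper limit is 800 ms.
print(health_zone(150), health_zone(500), health_zone(900))
# Lowering the threshold to 100 ms moves the same 150 ms response into the tolerated zone.
print(health_zone(150, threshold_ms=100))
```

Lowering the threshold tightens every zone at once, because the upper limit is always four times the threshold.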
To change the threshold for an End Point or set of End Points:
- From the main Spring Insight dashboard, click the Administration tab.
- Click End Point Threshold in the left pane.
- In the right pane, click New:
In the Rule field, enter a regular expression that corresponds to the End Point or End Points for which you want to change the threshold. For example, if you want to specify all
Enter the new threshold. As described above, the default Spring Insight threshold is 200 ms.
Click Save. The Matching End Points column automatically shows the number of End Points that match this rule; the number is a link. Click on this link to see the list of matching End Points.
Click Make Permanent to apply the change. The new rule appears in the table:
If you have more than three rules, you can use the up and down arrow buttons to change the order in which Spring Insight applies the rules. Insight applies the rules from first in the list to last. The default rule (.*) should always be last.
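The reason the default rule belongs at the end can be seen in a sketch of ordered rule matching. This assumes that the first matching rule wins and that a rule must match the whole End Point name; neither detail is confirmed by the documentation. The function name and the End Point names are hypothetical.

```python
import re


def threshold_for(end_point, rules, default_ms=200):
    """Return the threshold of the first rule whose regular expression matches the End Point."""
    for pattern, threshold_ms in rules:
        # Assumption: the pattern must match the entire End Point name.
        if re.fullmatch(pattern, end_point):
            return threshold_ms
    return default_ms


rules = [(r".*show.*", 100), (r".*", 200)]
print(threshold_for("HotelsController#show", rules))
print(threshold_for("HotelsController#search", rules))

# With the catch-all rule first, the more specific rule is never reached.
misordered = [(r".*", 200), (r".*show.*", 100)]
print(threshold_for("HotelsController#show", misordered))
```

Under these assumptions, moving .* ahead of a more specific rule hides that rule, because every End Point matches the catch-all first.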
Browse to a trace that matches the rule. Note that the y-axis of the Response Time Histogram now has an upper limit of 4-times the new threshold. The health of the End Point is now satisfied when its response time is below the new threshold. In our example, the new threshold is 100 ms and so the upper limit is 400 ms:
You can configure Spring Insight collection strategies for each Insight Agent, in its properties file, with the properties that are prefixed with insight.collection. Collection strategies include:
- Endpoints only. Limit a certain percentage of traces to resources that are potential endpoints.
- Prefix instrumentation. Selectively exclude resources from instrumentation at the package, class, or method level.
- Minimum operation duration. Exclude operations whose duration is less than a specified threshold.
On the Collection Strategies page, available on the Administration tab, you can view the current collection settings across all agents. You can also globally enable or disable a collection strategy from the user interface; for instance you could disable prefix instrumentation for all agents reporting to the dashboard.
If all agents reporting to the dashboard have the same setting for a configuration strategy, the value that is configured is prefaced by the string “global”, as is the case for the endpoints only strategy in the screenshot below. In contrast, note that the minimum operation duration strategy is enabled on one agent and disabled on the other.
To enable or disable a collection strategy for all agents immediately, click the pencil icon next to it on the Collection Strategies page. Note that the Collection Strategies page enables run-time control of collection strategies only. The next time an agent is restarted, it will again adhere to the collection strategy configuration defined in its properties file.
Speed Tracer is a Google Chrome extension that analyzes how your application is performing inside the browser. It measures how long the browser takes to render, transform CSS, show images, process events, and so on.
Although Speed Tracer is a great tool for determining where CPU time is spent within the browser process, it cannot see what the application itself is doing in the back end. For that, it needs Insight. The two products are integrated so that you can see Trace data interleaved with Speed Tracer’s browser timings.
To see Insight Trace data within Speed Tracer for an application deployed to Insight:
- Deploy your application to a tc Runtime instance that is configured with Spring Insight.
- If you have not already done so, download the version of Google Chrome that has been instrumented for Speed Tracer, install the Speed Tracer extension, and launch the Chrome browser with the appropriate flag. For details, see Getting Started with Speed Tracer.
- Open up the Speed Tracer console by clicking on the stopwatch icon in the top-right corner of the Google Chrome browser.
- Using Google Chrome, navigate to a page of your deployed application and perform some action.
- In Speed Tracer, click on the Network (resources) timeline. In the left column, search for resources that have a grey pillbox, with the tooltip "Includes timing data from the server"; these resources include Insight data along with the standard Speed Tracer data. See the graphic in the next step.
Expand the resource to view the Insight data, listed under the Server Trace section.
This section includes a brief summary of the Trace frame stack and allows easy navigation into various parts of Spring Insight related to the given Trace. To see more detailed information, select the Trace, End Point or Application links, which will jump into Insight at the appropriate location so you can further drill down.