The series_cosine_similarity function calculates the cosine similarity between two dynamic arrays (series) of numeric values. Cosine similarity measures the cosine of the angle between two vectors, providing a metric of similarity that ranges from -1 to 1. A value of 1 indicates identical direction, 0 indicates orthogonality (no similarity), and -1 indicates opposite directions. This function is particularly useful for comparing patterns, trends, and behaviors in time-series data.
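For reference, the standard textbook definition of cosine similarity can be written as below. The symbols A, B, and n are notation introduced here for illustration (two vectors of equal length n); they are not names used by the APL function.

```latex
\cos\theta
  = \frac{A \cdot B}{\lVert A \rVert \, \lVert B \rVert}
  = \frac{\sum_{i=1}^{n} A_i B_i}
         {\sqrt{\sum_{i=1}^{n} A_i^{2}} \; \sqrt{\sum_{i=1}^{n} B_i^{2}}}
```

The numerator is the dot product of the two vectors, and the denominator is the product of their magnitudes, which is why the result is undefined when either vector contains only zeros.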
You can use series_cosine_similarity when you need to identify similar patterns in different datasets, compare user behaviors, detect anomalies by measuring deviation from normal patterns, or find correlations between different metrics. Common applications include recommendation systems, anomaly detection, pattern matching in performance metrics, and behavioral analysis.
For users of other query languages
If you come from other query languages, this section explains how to adjust your existing queries to achieve the same results in APL.
Splunk SPL users
In Splunk SPL, calculating cosine similarity requires complex mathematical operations using eval commands with square roots and dot products. In APL, series_cosine_similarity provides this calculation directly for dynamic arrays.
ANSI SQL users
In SQL, calculating cosine similarity requires complex operations involving dot products, magnitudes, and square roots across multiple rows. You would typically need window functions and mathematical operations. In APL, series_cosine_similarity handles this calculation directly on dynamic arrays.
Usage
Syntax
Parameters
| Parameter | Type | Description |
|---|---|---|
| array1 | dynamic | The first dynamic array of numeric values. |
| array2 | dynamic | The second dynamic array of numeric values. |
Returns
A real value between -1 and 1 representing the cosine similarity between the two arrays. Returns null if either array is empty or contains only zeros.
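The return behavior described above can be illustrated outside APL with a minimal Python sketch. This is not Axiom's implementation: the helper name `cosine_similarity` is hypothetical, `None` stands in for null, and the equal-length check is an assumption made for this sketch rather than documented behavior of the function.

```python
import math
from typing import Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> Optional[float]:
    """Hypothetical stand-in that mirrors the documented return values."""
    # Documented null case: either array is empty.
    if not a or not b:
        return None
    # Assumption for this sketch: both arrays must have the same length.
    if len(a) != len(b):
        raise ValueError("arrays must have the same length")
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Documented null case: either array contains only zeros.
    if norm_a == 0 or norm_b == 0:
        return None
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (norm_a * norm_b)


print(cosine_similarity([1, 2, 3], [2, 4, 6]))  # same direction, close to 1
print(cosine_similarity([1, 0], [0, 1]))        # orthogonal, 0.0
print(cosine_similarity([1, 2], [-1, -2]))      # opposite direction, close to -1
print(cosine_similarity([0, 0], [1, 2]))        # all-zero input, None
```

Because of floating-point rounding, results for parallel or opposite vectors can differ from exactly 1 or -1 by a tiny amount, so compare with a tolerance rather than strict equality.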
Use case examples
- Log analysis
- OpenTelemetry traces
- Security logs
In log analysis, you can use series_cosine_similarity to compare request duration patterns between different users to identify similar usage behaviors.
Query
Output
| user1_durations | user2_durations | similarity |
|---|---|---|
| [120, 0, 300, 0] | [0, 150, 0, 280] | 0 |
This query compares request duration patterns between two users to identify behavioral similarities. In the sample row, the two arrays never have non-zero values at the same position, so their dot product, and therefore their similarity, is 0.
List of related functions
- series_add: Performs element-wise addition between two arrays. Use when you need to combine values rather than measure how similar the arrays are.
- series_divide: Performs element-wise division between two arrays. Use when you need to calculate ratios or normalize values.
- series_dot_product: Calculates the dot product between two arrays. Use when you need the raw dot product value rather than normalized similarity.
- series_sum: Calculates the sum of all elements in a single array. Use when you need to sum elements within one array rather than computing dot products.