The series_cosine_similarity function calculates the cosine similarity between two dynamic arrays (series) of numeric values. Cosine similarity is the cosine of the angle between two vectors and ranges from -1 to 1. A value of 1 means the vectors point in the same direction, 0 means they are orthogonal (no similarity), and -1 means they point in opposite directions. This makes the function useful for comparing patterns, trends, and behaviors in time-series data.

Use series_cosine_similarity to identify similar patterns in different datasets, compare user behaviors, detect anomalies by measuring deviation from normal patterns, or find correlations between different metrics. Common applications include recommendation systems, anomaly detection, pattern matching in performance metrics, and behavioral analysis.
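For reference, the standard textbook definition of cosine similarity for two vectors a and b of equal length n is shown below. The symbols a, b, and n are introduced here for illustration only. This page does not state how APL handles arrays of different lengths, so that case is not covered.

```latex
\[
\operatorname{cos\_sim}(a, b)
  = \frac{a \cdot b}{\lVert a \rVert \, \lVert b \rVert}
  = \frac{\sum_{i=1}^{n} a_i b_i}
         {\sqrt{\sum_{i=1}^{n} a_i^{2}} \; \sqrt{\sum_{i=1}^{n} b_i^{2}}}
\]
```

The denominator is zero when either vector contains only zeros, so the ratio is undefined in that case.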

For users of other query languages

If you come from other query languages, this section explains how to adjust your existing queries to achieve the same results in APL.
In Splunk SPL, calculating cosine similarity requires complex mathematical operations using eval commands with square roots and dot products. In APL, series_cosine_similarity provides this calculation directly for dynamic arrays.
... | eval dot_product = mvzip(array1, array2) | eval similarity = dot_product / (sqrt(sum1) * sqrt(sum2))
In SQL, calculating cosine similarity requires computing dot products, magnitudes, and square roots across multiple rows. You typically need to join the two series on their index and combine aggregate functions with mathematical operations. In APL, series_cosine_similarity handles this calculation directly on dynamic arrays.
SELECT 
  SUM(a.value * b.value) / 
  (SQRT(SUM(a.value * a.value)) * SQRT(SUM(b.value * b.value))) AS similarity
FROM array_a a, array_b b
WHERE a.index = b.index;

Usage

Syntax

series_cosine_similarity(array1, array2)

Parameters

Parameter | Type | Description
array1 | dynamic | The first dynamic array of numeric values.
array2 | dynamic | The second dynamic array of numeric values.

Returns

A real value between -1 and 1 representing the cosine similarity between the two arrays. Returns null if either array is empty or contains only zeros.
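To make the return behavior concrete, here is a minimal Python sketch of the same calculation. It is an illustrative re-implementation, not APL's actual code: the function name cosine_similarity is hypothetical, None stands in for null, and the equal-length check is an assumption because this page does not say how mismatched lengths are handled.

```python
import math
from typing import Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> Optional[float]:
    """Illustrative sketch of cosine similarity; not APL's implementation."""
    # Assumption: both arrays must have the same length. The documentation
    # does not describe mismatched lengths, so this sketch rejects them.
    if len(a) != len(b):
        raise ValueError("arrays must have the same length")

    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))

    # Mirrors the documented behavior: an empty or all-zero array has zero
    # magnitude, so the result is null (None here).
    if norm_a == 0 or norm_b == 0:
        return None

    dot = sum(x * y for x, y in zip(a, b))

    # Clamp to [-1, 1] to absorb floating-point rounding.
    return max(-1.0, min(1.0, dot / (norm_a * norm_b)))


if __name__ == "__main__":
    print(cosine_similarity([1, 2, 3], [2, 4, 6]))  # same direction, close to 1
    print(cosine_similarity([1, 0], [0, 1]))        # orthogonal, 0
    print(cosine_similarity([1, 2], [-1, -2]))      # opposite, close to -1
    print(cosine_similarity([0, 0], [1, 2]))        # all zeros, None
```

Scaling either array by a positive constant does not change the result, because only the direction of the vectors matters.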

Use case examples

Log analysis

In log analysis, you can use series_cosine_similarity to compare request duration patterns between different users to identify similar usage behaviors.

Query
['sample-http-logs']
| summarize user1_durations = make_list(iff(id == 'user1', req_duration_ms, 0)), user2_durations = make_list(iff(id == 'user2', req_duration_ms, 0))
| extend similarity = series_cosine_similarity(user1_durations, user2_durations)
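Output

user1_durations | user2_durations | similarity
[120, 0, 300, 0] | [0, 150, 0, 280] | 0

This query compares request duration patterns between two users. Each log row belongs to at most one of the two users, so the iff padding places a 0 in one array wherever the other has a value. The dot product is therefore 0, and so is the similarity for the sample arrays shown. For a meaningful comparison, build both arrays over the same positions, for example per-user values aggregated into the same time bins.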
List of related functions

  • series_add: Performs element-wise addition between two arrays. Use when you need to combine values instead of calculating ratios.
  • series_divide: Performs element-wise division between two arrays. Use when you need to calculate ratios or normalize values.
  • series_dot_product: Calculates the dot product between two arrays. Use when you need the raw dot product value rather than normalized similarity.
  • series_sum: Calculates the sum of all elements in a single array. Use when you need to sum elements within one array rather than computing dot products.