SkyWalking Horizon UI 0.7.0 is released. Go to the downloads page to find the release tars.
This release broadens what Horizon can observe — browser JS errors with source-map de-obfuscation, an Airflow layer, and a clustered BanyanDB self-observability layer — and deepens how you navigate a deployment: lock-and-compare across entities on a layer dashboard, three new topology drill-downs (instance map, intra-service deployment, API dependency), and portable dashboard config with OAP as the single source of truth.
Browser errors & source maps
- A new Browser Logs tab on the BROWSER layer lists the JS error logs the browser agent reports (message, category, page, app version, time, and the minified line:col), filterable by category and time window. Expanding a row shows the raw stack alongside a de-obfuscated view.
- Source-map de-obfuscation (skywalking#6784): upload a .map file and resolve any error’s minified stack back to the original file, line, column, symbol name, and a source snippet by picking which map to apply. Uploaded maps live in the BFF’s memory only (no backend storage): they are surfaced as temporary, LRU-evicted at the configured budget, and lost on restart. For durable provisioning, mount .map files into the server’s static source-map directory (HORIZON_SOURCEMAPS_DIR, /app/sourcemaps in the image); those reload automatically and can’t be deleted from the UI. Budgets are tuned via the new sourceMaps block in horizon.yaml (defaults 64 MiB per-file / 512 MiB total). Upload/delete require the new source-map:write permission; viewing and resolving ride on browser-errors:read.
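The eviction behaviour described above (a per-file limit, a total budget, least-recently-used maps dropped first) can be pictured with a minimal sketch. This is not Horizon's implementation: the class name SourceMapCache and its methods are hypothetical, and only the 64 MiB / 512 MiB defaults come from the notes above.

```python
from collections import OrderedDict

MIB = 1024 * 1024


class SourceMapCache:
    """Hypothetical in-memory store: LRU eviction under a total byte budget."""

    def __init__(self, max_file_bytes=64 * MIB, max_total_bytes=512 * MIB):
        self.max_file_bytes = max_file_bytes
        self.max_total_bytes = max_total_bytes
        self._maps = OrderedDict()  # name -> bytes, least recently used first
        self._total = 0

    def put(self, name, data):
        if len(data) > self.max_file_bytes:
            raise ValueError(f"{name} exceeds the per-file budget")
        if name in self._maps:
            # Replacing an existing map: release its bytes first.
            self._total -= len(self._maps.pop(name))
        self._maps[name] = data
        self._total += len(data)
        # Evict from the least-recently-used end until the total fits.
        while self._total > self.max_total_bytes:
            _, evicted = self._maps.popitem(last=False)
            self._total -= len(evicted)

    def get(self, name):
        data = self._maps.get(name)
        if data is not None:
            self._maps.move_to_end(name)  # mark as most recently used
        return data

    def names(self):
        return list(self._maps)


# Tiny budgets so the eviction is easy to follow.
cache = SourceMapCache(max_file_bytes=4, max_total_bytes=8)
cache.put("a.map", b"aaaa")
cache.put("b.map", b"bbbb")
cache.get("a.map")         # a.map becomes the most recently used
cache.put("c.map", b"cc")  # total would reach 10 bytes, so b.map is evicted
print(cache.names())       # ['a.map', 'c.map']
```

Nothing in the sketch persists to disk, which mirrors why uploaded maps are lost on restart while mounted ones are not.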
Two new layers: Airflow & BanyanDB self-observability
- Airflow (SWIP-7), under Workflow Scheduler: a service dashboard (scheduler / executor / pool KPIs and trends), a Components dashboard (per-host scheduler and triggerer metrics for Airflow 3.x native OTel), and a 3D Infra Map load ring for Tasks Executable. Pairs with OAP backend SWIP-7 (meter_airflow_*).
- BanyanDB (SWIP-15), under Self-Observability: models a clustered, role- and tier-aware BanyanDB deployment scraped through its FODC proxy. The cluster is one Cluster (service), each container a Container (instance, carrying its container_name role and node_type tier), and each storage Group an endpoint. It ships a Cluster dashboard (write/query/error-rate, capacity, a Containers-by-Role table), a role-adaptive Container dashboard (shared CPU/memory/Go-runtime resources, plus liaison ingestion / data storage / lifecycle migration panels gated on each container’s role), a per-data-model Group dashboard (measure / stream / trace / property), and a Deployment tab rendering the container inventory with role-pair-specific call edges. The whole deployment model is editable from Layer dashboards admin → Deployment scope. Pairs with OAP backend SWIP-15 (meter_banyandb_*).
Lock & compare entities on a dashboard
- Pin several services, instances, or endpoints, including ones from different services, and compare them in place. Compare is standard on every service / instance / endpoint layer dashboard; nothing to enable. The entity you’re viewing is always part of the comparison, tagged CURRENT and shown first; pinned entities add to it, up to six, each in its own stable hue.
- Each widget compares inline in its own tile: line widgets overlay one series per entity, card widgets show one row each, top-N and record widgets get per-entity tabs plus a merged All tab, and table widgets gain an Entity column. Each entity loads as its own request, so tiles fill in progressively and one slow or failed entity never blanks the others. The Topology, Deployment, trace, and log pages are unaffected.
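The failure isolation described above (one request per entity, with a failed entity leaving the others intact) might look roughly like the following sketch. The function load_comparison, the fetch callback, and the row shape are all hypothetical, and the sketch reads "up to six" as a cap on pinned entities, which is an assumption. It waits for every request before returning, so it illustrates isolation only, not the progressive fill.

```python
import asyncio

MAX_PINNED = 6  # assumption: "up to six" is read as a cap on pinned entities


async def load_comparison(current, pinned, fetch):
    """Fetch one result per entity; a failure only affects that entity's row."""
    # The viewed entity always comes first, followed by the pinned ones.
    entities = [current] + [e for e in pinned if e != current][:MAX_PINNED]
    results = await asyncio.gather(
        *(fetch(e) for e in entities), return_exceptions=True
    )
    return [
        {
            "entity": entity,
            "current": entity == current,
            "data": None if isinstance(result, Exception) else result,
            "error": str(result) if isinstance(result, Exception) else None,
        }
        for entity, result in zip(entities, results)
    ]


async def fake_fetch(entity):
    # Stand-in for a metrics query; svc-b simulates a failing backend call.
    if entity == "svc-b":
        raise RuntimeError("timeout")
    return [1, 2, 3]


rows = asyncio.run(load_comparison("svc-a", ["svc-b", "svc-c"], fake_fetch))
for row in rows:
    print(row)
```

The row for svc-b carries an error while svc-a and svc-c still carry data, which is the property the bullet above describes.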
New topology drill-downs
- Instance map — click a call between two services, then Instance map →, to open each service’s instances as client / server columns with the instance-level calls between them: pan/zoom, animated flow, per-call client/server metric sidebar, and relationship-aware pair pickers drawn from the call graph. Configurable per layer (Layer dashboards → Topology → Enable instance topology), on out of the box for General, Service Mesh, Kubernetes Service, and Cilium Service.
- Deployment tab — the instance-to-instance call graph within a single service (e.g. a clustered store’s nodes calling each other). Instances render as hexagons that bundle into pods (main + sibling containers); cluster them by one or more instance attributes or a name regex; a tiered, draggable layout reads upstream→downstream left-to-right. Edge metrics are keyed by (source-role → target-role) pair, with a primary metric printed inline and a Flows sub-tab tabling every edge per role-pair. Off by default; opt in per layer.
- API dependency tab — an endpoint’s caller → callee chain as a column graph (callers left, focus centre, callees right) with the same health-ring borders, SLA-coloured RPM, and latency as the service map. A single + handle pulls in an endpoint’s own callers and callees to walk the chain; drill-outs open in a new browser tab. Localized across all eight UI languages.
Topology readability
- A new Filter control on the per-layer service map (and the embedded topology widget) hides the conjectured peers that clutter a dense map — faceted by layer (each row carrying the layer’s own icon and localized name), plus a standalone User toggle and an Others bucket for unresolved peers. Filtering is client-side and defaults to showing everything.
- Technology component icons now render on service-map nodes — the same icon set the trace waterfall uses, so a PostgreSQL node looks like PostgreSQL — falling back to the generic glyph when a component ships no icon.
- The topology’s service selector now groups its list by OAP Service.group; clicking a group header batch-selects or unselects every service in that group.
Service group as a first-class layer axis
- A per-layer Split menu by service group toggle fans a layer into one sidebar entry per OAP Service.group (the <group>:: prefix), each entry scoped to its group across header, picker, topology, dashboards, and roster. The service picker surfaces each service’s group chip, and the navigation sidebar is now resizable (drag the divider, double-click to reset; width persists per browser) so long group-split names stay readable.
- Every layer OAP reports now appears in the sidebar, including layers with no Horizon template (they render with default capabilities). A layer is hidden only when an admin disables its template or it is listed in the new config-driven layers.excluded block in horizon.yaml (defaults FAAS and VIRTUAL_GATEWAY).
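As a rough illustration of the two rules above, the sketch below splits service names on the <group>:: prefix and hides layers that are excluded or have a disabled template. The function names and the empty-string key for ungrouped services are hypothetical; only the prefix convention and the two default excluded layers come from the notes.

```python
from collections import defaultdict

# Defaults quoted above for layers.excluded.
EXCLUDED_LAYERS = {"FAAS", "VIRTUAL_GATEWAY"}


def visible_layers(reported, disabled_templates=frozenset(), excluded=EXCLUDED_LAYERS):
    """Keep every reported layer unless it is excluded or its template is disabled."""
    return [
        layer for layer in reported
        if layer not in excluded and layer not in disabled_templates
    ]


def split_by_group(service_names):
    """Bucket services by their '<group>::' prefix; ungrouped names go under ''."""
    groups = defaultdict(list)
    for name in service_names:
        group, sep, rest = name.partition("::")
        if sep:
            groups[group].append(rest)
        else:
            groups[""].append(name)
    return dict(groups)


print(visible_layers(["GENERAL", "FAAS", "MESH"], disabled_templates={"MESH"}))
print(split_by_group(["payments::checkout", "payments::ledger", "gateway"]))
```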
Dashboard config is portable, and OAP is the source of truth
- Every template admin page — Overview templates, Layer dashboards, the 3D-map config, and the per-locale Translations — now has Export and Import. Export downloads the in-use version (what end users render) as JSON for backup, sharing, or moving a dashboard to another OAP; Import loads a JSON file as a local draft to preview, then publish with Check diff & push. Import never writes OAP directly.
- Runtime config is strictly what’s on OAP. Dashboards, overviews, and topology now render only the version published to OAP’s UI-template store (or the in-code minimal default) — the disk-bundled templates reach a running UI only by being synced to OAP or through the admin Preview button, never as a silent live fallback. An unreachable template store is a visible block (a banner, matching the OAP-query-unreachable strip), not a quiet bundled back-fill. The Preview button now drives every template-rendered page — overview detail, per-layer topology (incl. the instance map), API dependency, traces, and network profiling.
Layer landing shows every service
- The layer landing now probes all services up to a configurable cap (query.landingServiceCap in horizon.yaml, default 100) and runs a cheap ranking pass to pick the true top-N by the landing’s order-by column, replacing the old cap of the first 25 by list order, which both hid services and mis-ranked them. The service picker lists the whole layer (below-cap services show low in the ranked column rather than disappearing), and the header chip reads “metrics: top N” to make the trim explicit.
- Selecting a low-traffic, below-cap service now works on every tab: logs, traces, and endpoint-dependency resolve the name from the full roster, not just the landing sample, so a tail service drills in everywhere.
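One way to read the probe-then-rank behaviour above is sketched below. It is an interpretation, not the actual landing code: rank_landing, order_by, and the top_n default are hypothetical, and only the cap default of 100 and the low marker come from the notes.

```python
def rank_landing(services, order_by, cap=100, top_n=10):
    """Probe at most `cap` services, rank them, and keep the full roster for the picker."""
    probed = services[:cap]
    # Rank the probed services by the order-by metric, highest first.
    top = sorted(probed, key=order_by, reverse=True)[:top_n]
    top_set = set(top)
    # The picker still lists every service; those outside the top-N show "low".
    picker = [(s, order_by(s) if s in top_set else "low") for s in services]
    return top, picker


rpm = {"a": 5, "b": 50, "c": 20, "d": 1}
top, picker = rank_landing(list(rpm), rpm.get, cap=3, top_n=2)
print(top)     # ['b', 'c']
print(picker)  # [('a', 'low'), ('b', 50), ('c', 20), ('d', 'low')]
```

Compared with taking the first services in list order, ranking after probing is what keeps a busy service such as b from being hidden behind a quiet one such as a.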
Widgets & formatting
- Layer-dashboard widgets gain a structured Visible when gate, by an MQE expression (has-value, or > / < a threshold) or by an entity attribute (e.g. language equals JAVA), evaluated server-side, so a gated-out group’s queries are skipped entirely (a non-JVM instance no longer runs the JVM widget queries at all).
- New card formats: enum maps a coded metric to a readable, per-locale-translatable label (1 → OK), and duration renders a SECONDS metric as a human time-ago (5m 20s ago). Record widgets drill into the originating trace, resolved by trace id (so it works across layers), with click-to-copy statement text. The instance-list badge is now configurable per layer (any attribute instead of the fixed agent language, hidden when empty or UNKNOWN). Large numbers on axes and tooltips use compact SI suffixes (45.1k, 1.34M) instead of scientific notation.
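The two display formats above can be approximated with a short sketch. The helper names are hypothetical and the rounding rule (about three significant digits) is an assumption chosen to reproduce the examples in the notes; the G suffix is also an assumption, since the notes only show k and M.

```python
def _trim(x):
    """Render roughly three significant digits without scientific notation."""
    digits = 0 if abs(x) >= 100 else 1 if abs(x) >= 10 else 2
    text = f"{x:.{digits}f}"
    return text.rstrip("0").rstrip(".") if "." in text else text


def compact_si(value):
    """Format a number with a compact SI suffix (k, M, G)."""
    for threshold, suffix in ((1e9, "G"), (1e6, "M"), (1e3, "k")):
        if abs(value) >= threshold:
            return _trim(value / threshold) + suffix
    return _trim(value)


def time_ago(seconds):
    """Render a SECONDS value as a short human-readable 'ago' string."""
    hours, rem = divmod(int(seconds), 3600)
    minutes, secs = divmod(rem, 60)
    parts = []
    if hours:
        parts.append(f"{hours}h")
    if minutes:
        parts.append(f"{minutes}m")
    if secs or not parts:
        parts.append(f"{secs}s")
    return " ".join(parts) + " ago"


print(compact_si(45100))    # 45.1k
print(compact_si(1340000))  # 1.34M
print(time_ago(320))        # 5m 20s ago
```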
Live debugger & DSL apply
- The live debugger groups MAL sample fan-outs by metric into one-line summaries and opens an expanded group straight into a diff view, with shared labels dimmed into a “common” block and only the differing labels highlighted, so it’s immediate which label distinguishes each sample. Multiple output entities fold the same way; long fractional rate() / avg() values are trimmed for display (the exact value stays on hover).
- DSL management shows live apply progress. A structural rule change (scope, downsampling, or metric set) now tracks the apply across the cluster through a phase stepper (Compiled → Confirming → Committing → Done) and reports success only once OAP confirms durability. “Applied — cluster propagation unconfirmed” is surfaced as a warning (the rule is applied; lagging nodes self-converge), a failed apply is called out as rolled back with the edit kept for a retry, and a one-click Force re-apply recovers a stuck node.
Access control, performance & fixes
- RBAC: the Roles & Permissions board now lists infra-3d:read; editing a layer-dashboard template gates on dashboard:write (publishing overview / alert / 3D-map configs stays on overview:write); the Cluster Status debug view needs only live-debug:read; and saving a local draft enforces the same per-kind permission as publishing.
- Performance: layer dashboards reuse the warm per-layer service catalog (one fewer OAP round-trip), the alarms list and count fire their two startup probes in parallel, the 3D Infra Map loads metrics in bounded-concurrency batches, and an oversized topology (>5,000 services / 15,000 calls) fails with a clear “too large to render” notice instead of an unreadable map. Partial metric-load failures now surface a banner so a backend hiccup isn’t misread as real “no traffic” data.
- Fixes: the API-dependency tab honors the topbar time picker; one failed metric group no longer blanks an entire dashboard; trace-list rows pick the correct root span on BanyanDB; the server-timezone offset is cached per OAP URL so repointing OAP re-probes immediately; baseline security headers (X-Content-Type-Options: nosniff, X-Frame-Options: DENY, Referrer-Policy: no-referrer) are sent on every response; and the profiling pages use more of the page height.
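The bounded-concurrency loading mentioned under Performance can be sketched with a semaphore. This is a generic illustration rather than the 3D Infra Map's code: load_bounded, the fetch callback, and the limit value are hypothetical, and a semaphore bounds in-flight requests continuously rather than in fixed batches.

```python
import asyncio


async def load_bounded(keys, fetch, limit=4):
    """Run fetch(key) for every key with at most `limit` requests in flight."""
    sem = asyncio.Semaphore(limit)

    async def one(key):
        async with sem:  # wait here while `limit` fetches are already running
            return await fetch(key)

    return await asyncio.gather(*(one(k) for k in keys))


# Track how many fake fetches overlap to confirm the bound holds.
state = {"active": 0, "peak": 0}


async def fake_fetch(key):
    state["active"] += 1
    state["peak"] = max(state["peak"], state["active"])
    await asyncio.sleep(0)  # yield so other tasks get a chance to start
    state["active"] -= 1
    return key * 2


results = asyncio.run(load_bounded(range(10), fake_fetch, limit=3))
print(results, "peak concurrency:", state["peak"])
```

Results come back in input order because asyncio.gather preserves the order of its arguments, regardless of which fetch finishes first.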
Full release notes are here.