Pablo Zavala · AI Safety Evaluation · Research Engineering

Cybersecurity Anomaly Detection

A Carnegie Mellon coursework project using Isolation Forests and UMAP on the BETH kernel-level security-events dataset. The course report records 95 percent accuracy; detailed validation materials are available by request.

Course report: Isolation Forests and UMAP over 8M+ kernel-level security events

Coursework, request-only report

Detailed materials are available by request; the public page uses a compact evidence card.

Role: Security analytics coursework covering anomaly scoring and dimensionality reduction.

How to Inspect This Work

Dataset scale

The evidence card leads with the scale of the BETH dataset (more than eight million kernel-level events) because the project is about high-volume security telemetry.

Method

Isolation Forests supply anomaly scores and UMAP provides dimensionality reduction for inspection.

Claim limit

Because the detailed report is not public, the 95 percent accuracy figure should be read as a coursework result whose supporting materials are available on request.

Case Study

Problem

Kernel-level security telemetry is too large for manual labeling, which makes anomaly detection a useful testbed for unsupervised methods.

Setup

The coursework uses the BETH dataset of kernel-level security events, with more than eight million records.

Method

The pipeline pairs Isolation Forests for anomaly scoring with UMAP for dimensionality reduction.
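As a rough illustration of how such a pairing can be wired together, here is a minimal sketch using scikit-learn's IsolationForest and the umap-learn package. It is not the coursework code. The synthetic features, the contamination value, and the helper names score_events and project_2d are all assumptions made for the example.

```python
import numpy as np
from sklearn.ensemble import IsolationForest


def score_events(features, contamination=0.02, seed=0):
    """Fit an Isolation Forest and return (anomaly_scores, flags).

    Scores are negated so that higher means more anomalous.
    """
    forest = IsolationForest(
        n_estimators=100,
        contamination=contamination,  # assumed share of anomalies, illustrative only
        random_state=seed,
    )
    forest.fit(features)
    scores = -forest.score_samples(features)
    flags = forest.predict(features) == -1  # -1 marks predicted anomalies
    return scores, flags


def project_2d(features, seed=0):
    """Reduce features to two dimensions with UMAP for visual inspection."""
    import umap  # provided by the umap-learn package; imported lazily

    reducer = umap.UMAP(n_components=2, random_state=seed)
    return reducer.fit_transform(features)


# Synthetic stand-in for numeric event features (not BETH data).
rng = np.random.default_rng(0)
typical = rng.normal(0.0, 1.0, size=(980, 6))
unusual = rng.uniform(-10.0, 10.0, size=(20, 6))
events = np.vstack([typical, unusual])

scores, flags = score_events(events)
print(f"flagged {int(flags.sum())} of {len(events)} events")
```

Calling project_2d(events) returns an array of shape (n_events, 2) that can be scatter-plotted and colored by scores. By default IsolationForest builds each tree on a small subsample (at most 256 rows), which helps it scale to millions of events. On data of that size, UMAP would more plausibly be run on a sample than on every row.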

Result

The course report records 95 percent accuracy on the BETH dataset.
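This page does not state how the accuracy figure was computed. Read as ordinary classification accuracy, it would be the share of events whose anomaly flag matches a ground-truth label. The sketch below shows that arithmetic under that assumption, with a hypothetical accuracy helper and made-up flags and labels.

```python
def accuracy(predicted_flags, true_labels):
    """Fraction of events where the predicted flag matches the label."""
    if len(predicted_flags) != len(true_labels):
        raise ValueError("inputs must have the same length")
    if not true_labels:
        raise ValueError("inputs must be non-empty")
    matches = sum(bool(p) == bool(t) for p, t in zip(predicted_flags, true_labels))
    return matches / len(true_labels)


# Made-up flags and labels for ten events (True = anomalous).
predicted = [False, False, True, False, False, False, True, False, False, False]
labels = [False, False, True, False, False, False, False, False, False, False]

print(accuracy(predicted, labels))  # 9 of 10 flags match the labels -> 0.9
```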

Limitation

The project is coursework with no public artifact link; detailed materials are available by request.

Evidence

The public page uses a compact evidence card that summarizes the dataset scale, method, and reported result.

Key Outcomes

  • Unsupervised detection over more than eight million kernel-level security events
  • Reported 95 percent accuracy with Isolation Forests and UMAP

Methods

  • Isolation Forests
  • UMAP
  • Unsupervised anomaly detection