Dataset scale
The evidence card foregrounds the BETH kernel-event scale because the project is about high-volume security telemetry.
Pablo Zavala · AI Safety Evaluation · Research Engineering
A Carnegie Mellon coursework project using Isolation Forests and UMAP on the BETH kernel-level security-events dataset. The course report records 95 percent accuracy; detailed validation materials are available by request.
Course report: Isolation Forests and UMAP over 8M+ kernel-level security events
Detailed materials are available by request; the public page uses a compact evidence card.
Role: Security analytics coursework (anomaly scoring and dimensionality reduction).
Isolation Forests supply anomaly scores and UMAP provides dimensionality reduction for inspection.
Because the detailed report is not public, the 95 percent accuracy figure should be read as a coursework result whose supporting evidence is available on request.
Kernel-level security telemetry is too large for manual labeling, which makes anomaly detection a useful testbed for unsupervised methods.
The coursework uses the BETH dataset of kernel-level security events, with more than eight million records.
The pipeline pairs Isolation Forests for anomaly scoring with UMAP for dimensionality reduction.
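To make the anomaly-scoring half of that pairing concrete, here is a minimal from-scratch sketch of Isolation Forest scoring in plain Python. It is not the coursework implementation, which is not public. The function names (`fit_forest`, `anomaly_score`), the 100-tree and 256-sample defaults, and the synthetic two-feature data are all assumptions chosen for illustration, and the input is assumed to hold at least two rows.

```python
import math
import random


def c_factor(n):
    """Average path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    harmonic = math.log(n - 1) + 0.5772156649  # approximation of the harmonic number
    return 2.0 * harmonic - 2.0 * (n - 1) / n


def build_tree(rows, depth, max_depth, rng):
    """Isolate rows with random axis-aligned splits until the depth limit is reached."""
    if depth >= max_depth or len(rows) <= 1:
        return {"size": len(rows)}
    feature = rng.randrange(len(rows[0]))
    lo = min(r[feature] for r in rows)
    hi = max(r[feature] for r in rows)
    if lo == hi:  # nothing left to split on this feature
        return {"size": len(rows)}
    split = rng.uniform(lo, hi)
    left = [r for r in rows if r[feature] < split]
    right = [r for r in rows if r[feature] >= split]
    return {
        "feature": feature,
        "split": split,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(row, node, depth=0):
    """Depth at which row lands in a leaf, plus a correction for the leaf's size."""
    if "feature" not in node:
        return depth + c_factor(node["size"])
    child = node["left"] if row[node["feature"]] < node["split"] else node["right"]
    return path_length(row, child, depth + 1)


def fit_forest(data, n_trees=100, sample_size=256, seed=0):
    """Build n_trees isolation trees, each on a random subsample of data."""
    rng = random.Random(seed)
    sample_size = min(sample_size, len(data))
    max_depth = math.ceil(math.log2(sample_size))
    trees = [
        build_tree(rng.sample(data, sample_size), 0, max_depth, rng)
        for _ in range(n_trees)
    ]
    return trees, sample_size


def anomaly_score(row, trees, sample_size):
    """Score in (0, 1]; rows that isolate in fewer splits score closer to 1."""
    mean_depth = sum(path_length(row, t) for t in trees) / len(trees)
    return 2.0 ** (-mean_depth / c_factor(sample_size))


# Synthetic stand-in for event features: a dense cluster plus one far-away point.
data_rng = random.Random(42)
normal = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(500)]
outlier = (8.0, 8.0)

trees, n = fit_forest(normal + [outlier])
outlier_score = anomaly_score(outlier, trees, n)
typical_score = anomaly_score((0.0, 0.0), trees, n)
print(outlier_score > typical_score)
```

Because each tree is fit on a small subsample, the cost of building the forest does not grow with the full record count, which is one reason the method suits telemetry at this scale. In practice a library implementation such as scikit-learn's `IsolationForest` would be the usual choice, with the `UMAP` class from the `umap-learn` package handling the dimensionality-reduction step.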
The course report records 95 percent accuracy on the dataset.
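An accuracy figure for an unsupervised scorer only makes sense once scores are thresholded and compared against labels. The sketch below shows that arithmetic under stated assumptions: the helper `accuracy_at_threshold`, the example scores, the 0/1 labels, and the thresholds are all hypothetical, and nothing here reflects how the course report derived its number.

```python
def accuracy_at_threshold(scores, labels, threshold):
    """Fraction of events where (score >= threshold) agrees with the 0/1 label."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


# Hypothetical anomaly scores and ground-truth labels (1 = anomalous event).
scores = [0.42, 0.45, 0.48, 0.71, 0.52, 0.80]
labels = [0, 0, 0, 1, 0, 1]

strict = accuracy_at_threshold(scores, labels, threshold=0.6)
loose = accuracy_at_threshold(scores, labels, threshold=0.5)
print(strict, loose)
```

The two calls give different results on the same scores, so a reported accuracy is best read together with the threshold and label set used to compute it.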
The project is coursework and does not have a public artifact link, so detailed materials are available by request.
The public page uses a compact evidence card that summarizes the dataset scale, method, and reported result.