User Guide
This guide provides a more in-depth look at how to use Pattern Analyzer's various features and interfaces.
1. Command-Line Interface (CLI)
The patternanalyzer CLI is the primary tool for automated analysis. The main command is analyze.
Basic Usage
patternanalyzer analyze <input_file> [options]
Key Options
- -o, --out <path>: Specifies the path for the output JSON report. Defaults to report.json.
- -c, --config <path>: Path to a JSON or YAML configuration file to customize the analysis pipeline.
- --profile <name>: Use a built-in analysis profile. Available profiles: quick, nist, crypto, full. This is an easy way to run a focused set of tests.
- --xor-value <0-255>: Applies a single-byte XOR transformation to the data before running tests.
- --html-report <path>: Generates a standalone HTML report in addition to the JSON output.
Example with a profile and HTML report:
patternanalyzer analyze suspicious.dat --profile crypto --html-report report.html
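For scripted or batch use, the same invocation can be driven from Python. The following is a minimal sketch that only assembles the flags listed under Key Options and passes them to subprocess. The helper names build_command and run_analysis are hypothetical (they are not part of Pattern Analyzer), and the sketch assumes the patternanalyzer executable is installed and on your PATH.

```python
import subprocess
from typing import List, Optional


def build_command(
    input_file: str,
    out: Optional[str] = None,
    profile: Optional[str] = None,
    xor_value: Optional[int] = None,
    html_report: Optional[str] = None,
) -> List[str]:
    """Assemble a `patternanalyzer analyze` argument list (hypothetical helper)."""
    cmd = ["patternanalyzer", "analyze", input_file]
    if out is not None:
        cmd += ["--out", out]
    if profile is not None:
        cmd += ["--profile", profile]
    if xor_value is not None:
        # The CLI documents a single-byte value, so reject anything outside 0-255.
        if not 0 <= xor_value <= 255:
            raise ValueError("xor_value must be in the range 0-255")
        cmd += ["--xor-value", str(xor_value)]
    if html_report is not None:
        cmd += ["--html-report", html_report]
    return cmd


def run_analysis(input_file: str, **options) -> subprocess.CompletedProcess:
    """Run the CLI and raise CalledProcessError on a non-zero exit code."""
    return subprocess.run(build_command(input_file, **options), check=True)
```

Calling run_analysis("suspicious.dat", profile="crypto", html_report="report.html") would issue the same command as the example above. Passing the arguments as a list avoids shell quoting problems with unusual file names.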
2. Configuration Files
For full control over the analysis, you can use a configuration file in YAML or JSON format.
Structure
A configuration file can specify transforms, tests, and global settings.
Example (config.yml):
# 1. A list of transformations to apply in sequence
transforms:
  - name: xor_const
    params:
      xor_value: 85  # 0x55 in decimal

# 2. A list of tests to run on the transformed data
tests:
  - name: monobit
  - name: runs
    params:
      min_bits: 100  # Custom parameter for the runs test
  - name: ecb_detector
    params:
      block_size: 16

# 3. Global settings
fdr_q: 0.05  # False Discovery Rate significance level (q-value)
To use this file:
patternanalyzer analyze my_file.bin --config config.yml
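Because --config also accepts JSON, the same pipeline can be generated programmatically. The sketch below writes a JSON file using only the Python standard library. It assumes the JSON form uses exactly the same keys as the config.yml example, which this guide does not state explicitly; the write_config helper and the config.json file name are illustrative only.

```python
import json
from pathlib import Path

# Mirrors the config.yml example; assumes JSON configs share the YAML key names.
config = {
    "transforms": [
        {"name": "xor_const", "params": {"xor_value": 0x55}},  # 0x55 == 85
    ],
    "tests": [
        {"name": "monobit"},
        {"name": "runs", "params": {"min_bits": 100}},
        {"name": "ecb_detector", "params": {"block_size": 16}},
    ],
    "fdr_q": 0.05,
}


def write_config(path: str) -> Path:
    """Serialize the config dict to `path` as indented JSON (hypothetical helper)."""
    target = Path(path)
    target.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return target


if __name__ == "__main__":
    print(f"Wrote {write_config('config.json')}")
```

The resulting file would then be passed the same way: patternanalyzer analyze my_file.bin --config config.json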
3. Web User Interface (Streamlit)
The web UI provides an interactive way to upload files, select tests and transforms, and view results, including visualizations.
To launch the Web UI:
streamlit run app.py
Or, if the [ui] extras are installed:
patternanalyzer serve-ui
Navigate to the URL shown in your terminal to access the interface.
4. Interpreting the Results
The JSON output from an analysis contains three main sections: results, scorecard, and meta.
- results: A list where each item is the detailed output of a single test plugin. Key fields include:
  - test_name: The name of the test.
  - status: completed, skipped, or error.
  - p_value: The p-value from the statistical test. A low p-value (e.g., < 0.01) suggests the data is not random according to this test. A value of null means the test is diagnostic and doesn't produce a p-value.
  - fdr_rejected: true if the test's p-value was deemed significant after correcting for multiple comparisons (False Discovery Rate). This is the primary indicator of a "failed" test.
  - metrics: A dictionary of test-specific measurements and statistics.
- scorecard: A high-level summary of the entire analysis.
  - failed_tests: The number of tests where fdr_rejected was true. This is the most important summary metric.
  - total_tests: Total number of tests that were run.
  - p_value_distribution: Statistics on the distribution of p-values from all statistical tests.
- meta: Information about the analysis environment, including Python version, library versions, and a hash of the input data for reproducibility.
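As a rough illustration of how these fields might be consumed, the sketch below loads a report and groups test names by outcome. It relies only on the field names described above. The helper names load_report and summarize_report are hypothetical, and the sample dictionary is hand-written for illustration (including the test name example_diagnostic); it is not real tool output.

```python
import json
from typing import Any, Dict


def load_report(path: str) -> Dict[str, Any]:
    """Read a JSON report such as the default report.json."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def summarize_report(report: Dict[str, Any]) -> Dict[str, Any]:
    """Group test names by outcome using the documented result fields."""
    results = report.get("results", [])
    return {
        # fdr_rejected == true is the primary indicator of a "failed" test.
        "failed": [r["test_name"] for r in results if r.get("fdr_rejected") is True],
        # Tests that could not run to completion.
        "errored": [r["test_name"] for r in results if r.get("status") == "error"],
        # Completed tests with a null p_value are diagnostic only.
        "diagnostic": [
            r["test_name"]
            for r in results
            if r.get("status") == "completed" and r.get("p_value") is None
        ],
    }


# Hand-written sample shaped like the documented output; values are invented.
sample = {
    "results": [
        {"test_name": "monobit", "status": "completed", "p_value": 0.4, "fdr_rejected": False},
        {"test_name": "runs", "status": "completed", "p_value": 0.0001, "fdr_rejected": True},
        {"test_name": "example_diagnostic", "status": "completed", "p_value": None, "fdr_rejected": False},
    ],
    "scorecard": {"failed_tests": 1, "total_tests": 3},
}

if __name__ == "__main__":
    print(summarize_report(sample))
```

For a real run, summarize_report(load_report("report.json")) would apply the same grouping to the file written by the CLI. Checking fdr_rejected rather than comparing p_value against a fixed threshold keeps the summary consistent with the multiple-comparison correction described above.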