Jack Vogelgesang

Exploring Science and Technology

NFL Big Data Bowl 2025: Pre-snap Pressure Prediction Index

Kaggle notebook here if you prefer to read it that way

Code can be found on my GitHub here

Introduction

In the modern NFL, the battle between offensive and defensive units begins well before the snap. The 40-second play clock represents a complex chess match where defensive coordinators aim to disguise their pressure packages while offenses attempt to identify and counter potential threats.

Our Pre-snap Pressure Prediction Index (PPPI) introduces a novel approach to quantifying this pre-snap dynamic, offering teams a powerful tool for both strategic planning and real-time decision-making. The PPPI is a probability score ranging from 0 to 1 that indicates the likelihood of the quarterback facing pressure on a given play, based solely on pre-snap information.

PPPI values can be interpreted as follows:

Score Ranges

  • 0.00 – 0.25: Low pressure probability
  • 0.26 – 0.50: Moderate pressure probability
  • 0.51 – 0.75: High pressure probability
  • 0.76 – 1.00: Very high pressure probability
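As a quick reference, the bands above can be encoded as a small helper; the function name here is ours, not part of the PPPI pipeline itself:

```python
# Minimal sketch: map a PPPI score to its interpretive band.
# Thresholds mirror the score ranges listed above.

def pppi_category(score: float) -> str:
    """Return the qualitative pressure band for a PPPI score in [0, 1]."""
    if score <= 0.25:
        return "Low pressure probability"
    elif score <= 0.50:
        return "Moderate pressure probability"
    elif score <= 0.75:
        return "High pressure probability"
    return "Very high pressure probability"
```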

PPPI leverages Next Gen Stats tracking data to analyze defensive formations, player movements, and spatial relationships in the crucial moments before the snap. By combining advanced machine learning techniques with deep football knowledge, PPPI achieves what traditional statistics cannot: predicting quarterback pressure probability based solely on pre-snap information.

This metric has practical value to NFL teams, enabling them to:

  • Identify optimal protection schemes
  • Evaluate the effectiveness of defensive pressure disguises
  • Develop counter-strategies against specific defensive tendencies
  • Enhance quarterback preparation for recognizing pressure packages

Data and Feature Engineering

Our analysis began by filtering the Next Gen Stats tracking data to focus on all passing plays within the nine-week span of interest. This yielded 9,308 passing plays, which were further classified into those with and without pressure events; 4,276 plays were identified as having pressure events. For this analysis, pressure was treated as a binary outcome: 1 if any pressure occurred during the play and 0 otherwise. We did not evaluate the quality of the pressure or account for multiple pressures within a single play, which simplifies the modeling problem to a binary classification task.
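The labeling step can be sketched in a few lines of pandas; the column names below are illustrative stand-ins for the tracking data's actual schema:

```python
import pandas as pd

# Toy player-level table with a per-defender pressure flag
# (column names are illustrative, not the real schema).
player_play = pd.DataFrame({
    "gameId":         [1, 1, 1, 2, 2],
    "playId":         [10, 10, 20, 10, 10],
    "causedPressure": [0, 1, 0, 0, 0],
})

# A play is labeled 1 if ANY defender generated pressure, else 0 —
# multiple pressures on the same play still count only once.
labels = (
    player_play.groupby(["gameId", "playId"])["causedPressure"]
    .max()
    .rename("pressure")
    .reset_index()
)
```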

The foundation of pressure prediction lies in understanding the spatial relationships on the field, as defensive alignments relative to the ball and the offensive line are critical indicators of potential pressure. Similar to how quarterbacks assess defenses pre-snap, we calculated exact defender positions relative to the line of scrimmage and the relative distance between each defender and the offensive line. This approach mirrors the sequential decision-making process of scanning the overall defensive structure before identifying individual player movements. A full list of features used in PPPI calculations is provided in Appendix A, but key features include:

Focus on the “Box” Area

The “box” area, spanning 5 yards behind the line of scrimmage and 4 yards to either side of the center, is critical for identifying developing pressure based on the number of defenders in the box, their positioning, and their movement.
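A minimal sketch of the box indicator, using absolute distances from the ball for simplicity (the production feature, listed in Appendix A, uses the same 5-yard depth and 4-yard width thresholds):

```python
# Sketch of the "in the box" indicator described above: a defender counts
# as in the box if within 5 yards of the line of scrimmage in depth and
# within 4 yards of the ball laterally. Coordinate handling is simplified.

def in_box(defender_x: float, defender_y: float,
           ball_x: float, ball_y: float) -> bool:
    depth = abs(defender_x - ball_x)  # distance along the length of the field
    width = abs(defender_y - ball_y)  # lateral distance from the ball
    return depth <= 5.0 and width <= 4.0
```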

Offensive Line Structure

The offensive line’s structure provides equally important information:

  • Width of the line: Tackle-to-tackle spacing
  • Gaps between linemen: Identifying vulnerabilities in protection schemes
  • Formation balance: Assessing the overall setup
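These line-structure features reduce to simple arithmetic on the five linemen's lateral positions. A sketch with toy coordinates (the 26.65-yard field midpoint and 2-yard balance tolerance follow Appendix A):

```python
import numpy as np

# Lateral (y) positions of the five offensive linemen, left tackle to
# right tackle (toy values in yards).
ol_y = np.array([22.0, 24.5, 26.6, 28.7, 31.2])

ol_width   = ol_y.max() - ol_y.min()        # tackle-to-tackle spacing
ol_spacing = np.diff(np.sort(ol_y)).mean()  # average gap between linemen
# "Balanced" if the line's mean y-coordinate sits within 2 yards of the
# field midpoint (26.65 yards), per Appendix A.
ol_balanced = int(abs(ol_y.mean() - 26.65) <= 2.0)
```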

Defensive Front Complexity Metric

One of our most innovative features, this metric combines:

  • Variance in defender depth and width positioning
  • Spacing between defenders, hinting at potential stunts or twists
  • Pre-snap speed and acceleration of defenders, telegraphing blitz intentions
  • Orientation of defenders relative to the line of scrimmage
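One component of this metric, the front-complexity composite, is defined explicitly in Appendix A and can be sketched directly from that formula (the input values below are illustrative):

```python
# front_complexity composite, per Appendix A:
#   line_defenders × edge_defenders × width_std × (1 + avg_orientation / 180)

def front_complexity(line_defenders: int, edge_defenders: int,
                     width_std: float, avg_orientation: float) -> float:
    """Combine front personnel counts with positional variance and
    orientation into a single complexity score."""
    return (line_defenders * edge_defenders * width_std
            * (1 + avg_orientation / 180))
```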

Situational Context

We incorporate situational factors, recognizing that pressure tendencies vary significantly based on down, distance to the goal line or the first down marker, and the overall game situation. For example, third-and-long scenarios are more likely to feature exotic pressure packages than first-and-10 situations.


Model

Model Evaluation Metric: ROC AUC

We used the Receiver Operating Characteristic Area Under the Curve (ROC AUC) as the primary evaluation metric for its ability to measure how well the model distinguishes between pressure and non-pressure plays across all prediction thresholds.

Why ROC AUC? ROC AUC is well-suited for evaluating pressure prediction for several reasons:

  • Threshold Independence: It allows teams to set their own risk tolerance without altering the model, as it evaluates performance across all thresholds.
  • Balanced Tradeoffs: It considers both false positives (unnecessary protection adjustments) and false negatives (missed pressure predictions), ensuring balanced evaluation.
  • Interpretability: A score of 0.654, for example, means the model ranks pressure plays higher than non-pressure plays 65.4% of the time—a clear benchmark for improvement.
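The ranking interpretation is easy to verify on a toy example with scikit-learn (synthetic labels and scores, purely for illustration):

```python
from sklearn.metrics import roc_auc_score

# ROC AUC equals the probability that a randomly chosen pressure play
# receives a higher score than a randomly chosen non-pressure play.
y_true  = [1, 1, 1, 0, 0, 0]               # 1 = pressure occurred
y_score = [0.9, 0.6, 0.4, 0.5, 0.3, 0.2]   # model probabilities

# 8 of the 9 (pressure, non-pressure) pairs are ranked correctly,
# so AUC = 8/9 ≈ 0.889.
auc = roc_auc_score(y_true, y_score)
```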

Model Selection

We evaluated several modern machine learning approaches, each offering unique strengths for our problem. Figure 1 plots the ROC curves for all models evaluated, while Figure 2 provides a summary table of their cross-validation (CV) mean ROC AUC scores and final test ROC AUC scores:

Figure 1: ROC curves for all models and their respective AUC scores

Figure 2: Model comparison table showing CV mean ROC AUC scores, test set ROC AUC scores, and error bars for CV standard deviations.

Although XGBoost achieved a higher cross-validation score, we chose CatBoost as our production model because it delivered the best test set performance (ROC AUC: 0.796) and a better precision-recall balance than XGBoost, as shown in Figure 3 below.

Figure 3: A detailed comparison of CatBoost and XGBoost on metrics like accuracy, recall, and precision.

For a deeper dive into the training methodology, class imbalance strategies, and model optimization, refer to Appendix B. This appendix outlines our comprehensive approach to ensure statistical rigor and football-relevant predictions.

Feature Importance Analysis

Our SHAP (Shapley Additive Explanations) analysis revealed critical insights into what drives pressure prediction in the NFL, highlighting both the magnitude and direction of each feature’s impact on pressure probability (Figure 4). The features fall into three categories:

  1. Primary Pressure Indicators
    • Rusher Acceleration is the strongest predictor, with higher acceleration consistently correlating with increased pressure probability.
    • Number of Pass Rushers shows a bidirectional impact, where effectiveness depends on situational context.
    • Number of Blockers demonstrates a symmetric distribution, emphasizing that raw numbers alone don’t determine outcomes; context is key.
  2. Defensive Formation Indicators
    • Edge and Line Defenders show moderately positive SHAP values, with positioning being more important than mere presence.
    • Box Defenders and Defensive Spread provide consistent signals, suggesting that compact defensive setups can better predict pressure.
  3. Situational and Strategic Factors
    • Down Pressure and Down provide steady baseline signals.
    • Offensive Line Balance is mostly neutral but occasionally highly influential.
    • Yards to Go and Situation Severity are context-dependent, with symmetric SHAP distributions.

The SHAP plot also reveals nonlinear relationships, such as rusher acceleration’s pronounced impact at high values (red dots on the right). Similarly, the number of pass rushers shows complex dynamics, where both extremes—very high or very low—can significantly affect predictions.

These insights demonstrate that while pre-snap movement, particularly rusher acceleration, is the most influential factor, pressure prediction relies on a nuanced interplay of personnel, positioning, and context. Success requires evaluating multiple factors in combination rather than relying on any single indicator.

Figure 4: SHAP Summary Plot

PPPI Distribution and Reliability

PPPI demonstrates both strong predictive power and practical reliability, making it a valuable tool for in-game decision-making. On the test set, the model achieved a ROC AUC of 0.796, indicating a high ability to distinguish between pressure and non-pressure plays. Precision and recall for the true class (pressure plays) were 0.76 and 0.66, respectively, striking a strong balance between accurate predictions and minimizing false negatives.

One of the most compelling aspects of PPPI is its ability to stratify plays into meaningful risk categories through quartile analysis. Plays were divided into four quartiles based on their PPPI scores, and actual pressure rates correlated strongly with these groupings:

  • Q1 (bottom 25%, PPPI < 0.333): 15.4% pressure rate
  • Q2 (25–50%, PPPI 0.333–0.413): 25.7% pressure rate
  • Q3 (50–75%, PPPI 0.413–0.501): 37.5% pressure rate
  • Q4 (top 25%, PPPI > 0.501): 54.5% pressure rate

This clear progression, where pressure rates more than triple from the lowest to the highest quartile, demonstrates PPPI’s power to quantify defensive pressure risk with precision. Teams can leverage these quartiles to tailor their strategies, using low-risk situations for standard protection and high-risk scenarios to trigger quick-game adjustments or max protection schemes.
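The quartile analysis itself is a one-liner with pandas once each play has a PPPI score; the data below is a toy stand-in for the real test set:

```python
import pandas as pd

# Bucket plays by PPPI quartile and compare actual pressure rates per
# bucket (toy data for illustration).
df = pd.DataFrame({
    "pppi":     [0.10, 0.20, 0.35, 0.40, 0.45, 0.48, 0.60, 0.80],
    "pressure": [0,    0,    0,    1,    0,    1,    1,    1],
})

# qcut assigns equal-count quartiles based on the empirical distribution.
df["quartile"] = pd.qcut(df["pppi"], q=4, labels=["Q1", "Q2", "Q3", "Q4"])
pressure_rate = df.groupby("quartile", observed=True)["pressure"].mean()
```

A well-calibrated index shows pressure rates rising monotonically from Q1 to Q4, as in the real results above.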

Model Stability

Beyond its predictive strength, PPPI exhibits excellent stability across the dataset. The model’s output has a controlled spread, with a mean PPPI score of 0.422, a standard deviation of 0.130, and a range of 0.052 to 0.881. This distribution ensures that the index provides reliable and consistent predictions without extreme outliers, which is essential for actionable, real-time decision-making.

The stability of PPPI means that coaches and analysts can trust the model to deliver dependable insights across various game situations. Whether in pre-game planning or high-pressure in-game adjustments, the index offers consistent performance, enabling teams to build strategies with confidence.


Examples

Looking at the two plays in Figures 5 and 6, PPPI effectively captures defensive positioning and its likelihood to generate pressure. In the high PPPI example (Figure 5: NO vs ATL, PPPI 0.881), the defense exhibits an aggressive alignment with multiple defenders near the line of scrimmage and a compact formation, indicating a strong likelihood of pressure. In contrast, the low PPPI play (Figure 6: NYJ vs CIN, PPPI 0.052) features a conservative alignment, with defenders spread out and positioned 10–15 yards off the line, signaling a coverage-focused scheme.

The stark difference in PPPI scores (0.881 vs 0.052) reflects the defense’s structural intent, quantifying what coaches and players observe pre-snap. This demonstrates how PPPI translates complex defensive alignments into a single, actionable metric for predicting quarterback pressure.

Figure 5: Example of a play with a high PPPI score (Game ID: 2022091100 Play ID: 2555)

Figure 6: Example of a play with a low PPPI score (Game ID: 2022092506 Play ID: 2398)


Analysis and Implications

Defensive Tendencies and Applications

The PPPI model highlights pre-snap defender acceleration as the strongest predictor of pressure, revealing a strategic tradeoff: subtle motion can disguise blitzes, but excessive movement often signals pressure, providing offenses opportunities to adjust. Additionally, the rusher-to-blocker ratio, paired with spatial alignment data, refines the traditional “hat count” approach by emphasizing positioning and intent. Edge defenders consistently emerge as top contributors, underscoring their growing influence in modern NFL schemes.

PPPI insights inform both pre-game planning and in-game adjustments for offenses and defenses. For offenses, low PPPI scores (<0.25) suggest standard protection schemes are sufficient, while moderate scores (0.25–0.50) call for situational adjustments, and high scores (>0.50) warrant max protection or quick-game concepts. Motion and shifts can further neutralize high-risk scenarios.

For defenses, controlling pre-snap acceleration and optimizing edge alignment improve pressure disguises and maintain unpredictability. PPPI helps refine strategies, enabling teams to dynamically adapt during games to create optimal pressure scenarios.

Limitations and Considerations

PPPI has some limitations. It does not account for individual player skill differences or the quality of pressure, and team-specific tendencies may diverge from league-wide patterns. These factors can reduce its generalizability and impact in certain contexts.


Conclusion

The Pre-snap Pressure Prediction Index (PPPI) represents a significant advancement in quantifying defensive pressure before the snap, combining traditional football insights with modern machine learning. By validating long-held beliefs and uncovering new patterns—such as the role of pre-snap acceleration—PPPI provides a deeper understanding of how pressure develops and actionable strategies for teams to optimize offensive and defensive play.

With its quartile-based framework and stable predictions, PPPI offers practical, real-time applications, empowering teams to make informed adjustments during games. Looking ahead, integrating player-specific metrics and extending the model to other pre-snap events could enhance its utility, while counter-strategy recommendations based on PPPI insights hold promise for advancing game planning.

Ultimately, PPPI demonstrates how analytics can complement traditional football understanding, delivering actionable insights and elevating football strategy in a data-driven era.

Appendix A: Full List of Features Used in PPPI Calculation

Defensive Position Features

  1. Basic Position Metrics
    • distance_from_ball_x: Distance from ball along length of field
    • distance_from_ball_y: Distance from ball along width of field
    • in_box: Binary indicator if defender is in the box (5 yards depth, 4 yards width)
    • on_line: Binary indicator if defender is on the line (within 1 yard)
    • edge_position: Binary indicator if defender is in edge position
  2. Aggregate Position Stats
    • avg_depth, depth_std: Mean and standard deviation of defenders’ depths from ball
    • closest_defender, deepest_defender: Minimum and maximum defender depths
    • avg_width, width_std: Mean and standard deviation of defenders’ lateral positions
    • nearest_width, widest_width: Minimum and maximum lateral positions
    • box_defenders: Count of defenders in the box
    • line_defenders: Count of defenders on the line
    • edge_defenders: Count of defenders in edge positions

Movement Features

  1. Speed and Acceleration
    • avg_speed, max_speed: Mean and maximum speed of defenders (yards/sec)
    • avg_accel, max_accel: Mean and maximum acceleration of defenders (yards/sec²)
    • rushers_with_speed: Count of defenders with speed > 2 yards/sec
    • rushers_accelerating: Count of defenders with acceleration > 1 yard/sec²
  2. Direction and Orientation
    • avg_orientation, orientation_std: Mean and standard deviation of defender orientations (degrees)
    • avg_direction, direction_std: Mean and standard deviation of defender movement directions (degrees)

Offensive Line Features

  1. Line Structure
    • ol_width: Width of offensive line (tackle to tackle)
    • ol_depth: Average depth of offensive line
    • ol_spacing: Average spacing between linemen
    • ol_balanced: Binary (0/1) if offensive line is centered (mean y-coordinate within 2 yards of 26.65)
    • tight_end_attached: Presence of attached tight end

Matchup Features

  1. Personnel Ratios
    • num_pass_rushers: Count of initial pass rushers
    • num_blockers: Count of blockers
    • rusher_to_blocker_ratio: Ratio of rushers to blockers

Clustering Features

  1. Defensive Formation
    • min_defender_spacing: Minimum distance between defenders
    • avg_defender_spacing: Average distance between defenders
    • front_7_depth_variance: Variance in depth of front 7
    • front_7_width_variance: Variance in width of front 7

Feature Interactions

  1. Basic Interactions
    • speed_depth_interaction: Product of avg_speed and avg_depth
    • box_pressure_potential: Product of box_defenders and avg_speed
    • edge_speed_threat: Product of edge_defenders and max_speed
  2. Complex Metrics
    • defensive_spread: Product of width_std and depth_std
    • box_density: Ratio of box_defenders to width_std
    • front_complexity: Composite metric of line_defenders × edge_defenders × width_std × (1 + avg_orientation/180)
    • rusher_effectiveness: Composite metric combining num_pass_rushers, avg_speed, and spacing
    • defensive_momentum: Product of avg_speed × avg_accel with depth and orientation factors
    • edge_pressure_potential: Composite metric of edge pressure considering OL spacing
    • spacing_effectiveness: Composite metric of defender spacing effectiveness
    • ol_pressure_index: Stress metric based on rusher count and OL width
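Two of the simpler composites above follow directly from their definitions and can be computed in a couple of lines (toy input values for illustration):

```python
# defensive_spread and box_density, per their definitions above
# (toy values for illustration).
width_std, depth_std = 3.2, 1.8
box_defenders = 6

defensive_spread = width_std * depth_std   # overall positional spread
box_density = box_defenders / width_std    # defenders packed per unit of width
```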

Situational Features

  1. Game Context
    • down_pressure: Enhanced down importance (×1.5 for 3rd/4th down)
    • situation_severity: Composite metric of down_pressure, yardsToGo, and rusher count
    • pressure_situation: Binary (0/1) for critical situations (3rd/4th and long, 2nd and very long)
    • situational_pressure_intensity: Enhanced metric combining situation and defensive momentum
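The down_pressure weighting above reduces to a tiny function; the helper name is ours:

```python
# down_pressure: down number, weighted ×1.5 on third and fourth down to
# reflect the elevated likelihood of exotic pressure packages.

def down_pressure(down: int) -> float:
    return down * 1.5 if down >= 3 else float(down)
```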

Appendix B: Model Training and Validation Strategy

Our training methodology was designed to ensure both statistical rigor and football-relevant predictions. The process combines careful data partitioning, class imbalance handling, and sophisticated model optimization.

Data Preparation and Splitting

We begin with an 80-20 train-test split of our play data (~7,446 training plays, ~1,862 test plays), stratified to maintain consistent pressure rates across both sets. This initial split is crucial as it provides:

  • A large enough training set to learn complex pressure patterns
  • A realistic hold-out test set for final evaluation
  • Preserved pressure rate distributions in both sets
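The stratified split can be sketched with scikit-learn; the synthetic labels below mimic the dataset's roughly 46% pressure rate:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic labels approximating the data's ~46% pressure rate.
y = np.array([1] * 46 + [0] * 54)
X = np.arange(len(y)).reshape(-1, 1)  # placeholder feature matrix

# stratify=y keeps the pressure rate consistent across train and test.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)
```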

Class Imbalance Handling

To address the slight natural imbalance in pressure plays, we implement SMOTETomek resampling on the training data only. This approach:

  • Creates synthetic examples of pressure plays
  • Removes ambiguous samples near decision boundaries
  • Maintains a realistic, unbalanced test set
  • Prevents overfitting to the majority class

Model Training Optimization

The final model configuration includes several key optimizations:

  • 10,000 iterations with early stopping (100-iteration patience)
  • Conservative learning rate of 0.004 for stable convergence
  • Bayesian bootstrapping (temperature = 0.7) for robust sampling
  • Maximum tree depth of 7 with up to 115 leaves for complex pattern capture

Regularization Framework

To prevent overfitting while maintaining predictive power, we employ:

  • L2 regularization (l2_leaf_reg = 1.8)
  • Feature subsampling (colsample_bylevel = 0.73)
  • Minimum leaf size requirements (min_data_in_leaf = 50)
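The training and regularization settings above can be collected into a parameter dict; in the actual pipeline these would be passed to CatBoostClassifier (catboost is not imported here, and note that in CatBoost a leaf-count limit requires the Lossguide grow policy):

```python
# Production CatBoost settings, sketched as a parameter dict from the
# values listed above (illustrative; not the full production config).
catboost_params = {
    "iterations": 10_000,
    "early_stopping_rounds": 100,   # 100-iteration patience
    "learning_rate": 0.004,
    "bootstrap_type": "Bayesian",
    "bagging_temperature": 0.7,
    "depth": 7,
    "grow_policy": "Lossguide",     # required for a max_leaves limit
    "max_leaves": 115,
    "l2_leaf_reg": 1.8,
    "colsample_bylevel": 0.73,
    "min_data_in_leaf": 50,
    "eval_metric": "AUC",
}
```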

Validation Strategy

Model performance is evaluated through:

  • 10-fold cross-validation on the training set
  • Stratified folds to maintain pressure rates
  • Independent balancing for each validation fold
  • Consistent metrics across all evaluations
  • Final verification on the untouched test set
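The stratified 10-fold loop can be sketched as follows; each fold preserves the pressure rate, and any resampling (omitted here) would be applied inside the loop to the training folds only:

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Synthetic labels approximating the data's ~46% pressure rate.
y = np.array([1] * 46 + [0] * 54)
X = np.arange(len(y)).reshape(-1, 1)  # placeholder feature matrix

# StratifiedKFold keeps each validation fold's pressure rate close to
# the overall rate.
skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
fold_rates = [y[val_idx].mean() for _, val_idx in skf.split(X, y)]
```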