Data Science: CompStak’s Space Insights Rent Prediction


December 8, 2025

Introduction

Space Insight is CompStak’s hybrid econometric and machine-learning framework for estimating market rent using comparable lease transactions. It blends the transparency of traditional adjustment-based appraisal methods with the consistency and scale of a repeatable, model-driven process.
The framework is built around five core components:

Property CompSet, a two-stage search-and-ranking pipeline that identifies similar properties and their associated leases
Market-specific adjustment models trained independently in major markets
Pairwise feature engineering that measures structural, temporal, and lease-level differences between the subject and each target lease
A hybrid weighting scheme that balances adjustment magnitude with comparability confidence
Outlier handling to stabilize the final aggregated prediction

Together, these components create a unified process that behaves like an appraiser but with the reproducibility and scale of a machine learning model.

High-Level Workflow

Space Insight takes a subject space (property and lease characteristics) and a set of target leases used as comparables. Targets are sourced from CompStak’s Property CompSet model. Once these targets are identified, Space Insight retrieves their attributes, computes subject–target feature differences, estimates an adjusted rent for each target, filters outliers, and aggregates the remaining adjusted rents into a final estimate for the subject.

Market-Specific Model Training

Space Insight supports markets where CompStak’s data coverage is deep enough to enable reliable modeling. Each market has its own adjustment model trained on historical leases from that market. Features are selected for predictive value, stability over time, and interpretability, so the model reflects local pricing behavior rather than imposing a uniform national methodology.

Property CompSet and Target Selection

CompStak’s Property CompSet provides a consistent and scalable way to identify which properties, and therefore which leases, should be considered comparable to the subject. The approach combines large-scale vector search with a machine learning model trained on custom compsets historically created by CompStak users.
First, the pipeline uses vector search to locate properties similar to the subject across millions of records. Each property is represented using two embeddings: a location embedding that captures spatial proximity, and a feature embedding that reflects building characteristics including property size, year built, and building class. Modern vector indexing lets the system query these embeddings efficiently, returning candidates in milliseconds even when searching across the full national property database. Market, submarket, and property-type filters are applied during retrieval so the candidate set aligns with relevant local leasing conditions.
Next, the pipeline refines the candidate pool using a gradient-boosted model trained on CompStak user behavior. These user-curated comps provide valuable signal about which property attributes matter most in determining comparability. To capture these patterns, property features are transformed into market-normalized percentile values, allowing the model to interpret differences in attributes such as size or floor count appropriately for each market. The refinement model assigns each candidate property a comparability probability score indicating how likely it is to serve as a useful comparable for the subject.
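The two ideas in this section, similarity search over embeddings and market-normalized percentiles, can be illustrated with a small sketch. It is not CompStak's implementation: the brute-force cosine ranking stands in for a real vector index, and the function names (`cosine`, `top_k_similar`, `market_percentile`) and the toy two-dimensional embeddings are assumptions made for illustration.

```python
import math
from bisect import bisect_right


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top_k_similar(subject_vec, candidates, k):
    """Brute-force stand-in for a vector index: rank candidates by similarity.

    `candidates` maps a (hypothetical) property id to its embedding.
    """
    scored = [(cosine(subject_vec, vec), pid) for pid, vec in candidates.items()]
    scored.sort(reverse=True)
    return [pid for _, pid in scored[:k]]


def market_percentile(value, market_values):
    """Share of properties in the same market whose value is <= `value`."""
    ordered = sorted(market_values)
    return bisect_right(ordered, value) / len(ordered)


# Toy embeddings (assumed): A and B point in nearly the same direction as the subject.
candidates = {"A": [1.0, 0.0], "B": [0.9, 0.1], "C": [0.0, 1.0]}
nearest = top_k_similar([1.0, 0.0], candidates, k=2)

# A 50,000 sq ft building sits at the median of this small, made-up market.
size_pct = market_percentile(50_000, [10_000, 50_000, 200_000, 400_000])
```

Expressing size as a percentile means the same raw square footage can read as "large" in one market and "typical" in another, which is the effect the normalization step is described as providing.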

Lease Eligibility

Space Insight applies eligibility checks to ensure that target leases are appropriate for use in pricing. Leases are included only if they fall within a recent lookback window, contain required fields such as execution date, transaction size, and rent, and match the subject’s space type.
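The three checks described above can be sketched as a single predicate. This is a minimal illustration: the `Lease` dataclass, the `is_eligible` function, and the three-year `LOOKBACK` window are all assumptions, since the source does not state the window length or the underlying data model.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

# Assumed lookback window; the source does not state its length.
LOOKBACK = timedelta(days=3 * 365)


@dataclass
class Lease:
    """Hypothetical minimal lease record holding only the fields the checks need."""
    execution_date: Optional[date]
    transaction_size: Optional[float]
    rent: Optional[float]
    space_type: Optional[str]


def is_eligible(lease: Lease, subject_space_type: str, as_of: date) -> bool:
    """Return True only if the lease passes all three eligibility checks."""
    # 1. Required fields must be present.
    required = (lease.execution_date, lease.transaction_size, lease.rent)
    if any(value is None for value in required):
        return False
    # 2. Execution date must fall inside the lookback window.
    if not (as_of - LOOKBACK <= lease.execution_date <= as_of):
        return False
    # 3. Space type must match the subject.
    return lease.space_type == subject_space_type
```

Calling `is_eligible(Lease(date(2025, 1, 15), 5000.0, 62.5, "Office"), "Office", date(2025, 12, 8))` returns `True`, while the same lease with a missing rent or a different space type is rejected.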

Pairwise Feature Engineering

For each target lease, Space Insight computes a set of pairwise features capturing how the subject differs from the target. These features include building size, class, year built, floor count, submarket, floor level, transaction size, space type, sublease status, transaction type, execution date, and the target’s own adjusted market rent. Only features that improve predictive accuracy and remain stable over time are retained.
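As a rough sketch of what pairwise features might look like, the function below takes subject-minus-target differences for numeric attributes and match indicators for categorical ones. The field names, the `diff_`/`same_` prefixes, and the choice of encoding are assumptions for illustration; the source lists the attributes but not how each is encoded.

```python
from datetime import date

# Subset of the attributes named in the text; field names are hypothetical.
NUMERIC = ("building_size", "year_built", "floor_count", "floor_level", "transaction_size")
CATEGORICAL = ("building_class", "submarket", "space_type", "transaction_type")


def pairwise_features(subject: dict, target: dict) -> dict:
    """Build subject-minus-target differences and categorical match flags."""
    feats = {f"diff_{name}": subject[name] - target[name] for name in NUMERIC}
    feats.update({f"same_{name}": int(subject[name] == target[name]) for name in CATEGORICAL})
    # Temporal gap between the two execution dates, in days.
    feats["diff_execution_days"] = (subject["execution_date"] - target["execution_date"]).days
    return feats
```

Under this encoding, a subject on floor 12 compared with a target on floor 7 yields `diff_floor_level == 5`, and a pair in different submarkets yields `same_submarket == 0`.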

Market-Specific Model Inference

For each subject/target pair, Space Insight estimates an adjustment using:

\[
\Delta y_{sc} = \sum_{j} \beta^{(i)}_{j} \, \Delta X^{(i)}_{sc,j} + \sum_{k} \beta^{(p)}_{k} \, \Delta X^{(p)}_{sc,k}
\]

Here, $\Delta y_{sc}$ represents the predicted difference in rent between the subject lease $s$ and the target lease $c$. The terms $\Delta X^{(i)}$ capture differences in lease-level attributes (such as transaction size or floor), while the terms $\Delta X^{(p)}$ capture differences in property-level attributes (such as building floors, age, or submarket). The coefficients $\beta$ are trained separately for each market. For convenience in later steps, we refer to the target-specific adjustment simply as $\Delta y_c$ when the subject $s$ is implied. Let $r_c$ denote the observed adjusted rent for target lease $c$. The model combines this observed rent with the predicted adjustment to produce a subject-aligned rent estimate for each target:

\[
\hat{r}_c = r_c + \Delta y_c
\]

This value $\hat{r}_c$ reflects how the target lease would be priced if it matched the subject’s characteristics.
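The inference step can be sketched in a few lines, assuming the adjustment is a purely linear combination of feature differences with no intercept and that it is added to the target's observed rent. The function names, feature names, and coefficient values below are invented for illustration and are not fitted values from any market model.

```python
def predict_adjustment(coefs: dict, feature_diffs: dict) -> float:
    """Linear adjustment: sum of coefficient * feature difference.

    Features without a coefficient contribute nothing (assumed behavior).
    """
    return sum(coefs.get(name, 0.0) * value for name, value in feature_diffs.items())


def adjusted_rent(target_rent: float, adjustment: float) -> float:
    """Subject-aligned rent: the target's observed rent plus the predicted adjustment."""
    return target_rent + adjustment


# Hypothetical market coefficients (rent change per unit of feature difference).
coefs = {"diff_floor_level": 0.10, "diff_transaction_size": -0.0001}
diffs = {"diff_floor_level": 5, "diff_transaction_size": -2000}

delta = predict_adjustment(coefs, diffs)   # about 0.7 with these made-up numbers
r_hat = adjusted_rent(60.0, delta)         # about 60.7
```

With these made-up coefficients, a subject five floors higher and 2,000 square feet smaller than the target receives a small upward adjustment relative to the target's observed rent.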

Aggregation and Hybrid Weighting

Once adjusted rents $\hat{r}_c$ are computed for all targets $c$, Space Insight aggregates them into a single estimate for the subject.

Outlier filtering removes targets whose adjustments fall far outside the interquartile range, preventing extreme observations from distorting the result.
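A common way to implement interquartile-range filtering is with Tukey-style fences, sketched below. The fence multiplier `k=1.5` and the minimum sample size are assumptions; the source does not state the thresholds Space Insight uses.

```python
import statistics


def iqr_filter(adjustments, k=1.5):
    """Return indices of targets whose adjustment lies within the IQR fences.

    Fences are [Q1 - k*IQR, Q3 + k*IQR]. With fewer than four targets the
    quartiles are not meaningful, so everything is kept (assumed behavior).
    """
    if len(adjustments) < 4:
        return list(range(len(adjustments)))
    q1, _, q3 = statistics.quantiles(adjustments, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [i for i, adj in enumerate(adjustments) if low <= adj <= high]


# The last target needs a far larger adjustment than the rest and is dropped.
kept = iqr_filter([-1.0, 0.5, 0.0, 1.0, -0.5, 25.0])
```

Returning indices rather than values keeps each surviving adjustment aligned with its target's adjusted rent and comparability score in later steps.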

For the remaining targets, the model computes an adjustment-based weight derived from the magnitude of the predicted adjustment. Let $\Delta y_c$ denote the adjustment for target $c$. The adjustment-based weight decreases as $|\Delta y_c|$ grows, so targets requiring smaller adjustments receive larger influence in the final estimate.
Space Insight also incorporates a second weighting signal based on the comparability strength produced by the Property CompSet model. This score reflects how well each target property aligns with the subject property based on CompStak’s two-stage search-and-ranking process.
In practice, the final weight for each target combines these two signals into a single measure of relevance. Targets that both (i) require minimal adjustments and (ii) have strong comparability scores receive the greatest weight.
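The subject’s final predicted rent is the weighted average of target-level predictions:

\[
\hat{y}_s = \frac{\sum_{c} w_c \, \hat{r}_c}{\sum_{c} w_c}
\]

where $w_c$ is the final weight for target $c$.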
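The sketch below shows one way the two weighting signals could be combined into a weighted average. The source does not give the exact weight formula, so both the inverse-magnitude form `1 / (1 + |adjustment|)` and the multiplication by the comparability score are assumptions chosen only to reproduce the described behavior: small adjustments and strong comparability both increase a target's weight.

```python
def hybrid_estimate(adjusted_rents, adjustments, comp_scores):
    """Weighted average of adjusted rents under an assumed hybrid weight.

    weight = comparability score / (1 + |adjustment|)   (illustrative form only)
    """
    weights = [
        score / (1.0 + abs(adj))
        for adj, score in zip(adjustments, comp_scores)
    ]
    total = sum(weights)
    if total == 0:
        raise ValueError("all target weights are zero")
    return sum(w * r for w, r in zip(weights, adjusted_rents)) / total


# Two equally comparable targets; the second needed a larger adjustment,
# so the estimate is pulled toward the first target's adjusted rent.
estimate = hybrid_estimate([50.0, 60.0], adjustments=[0.0, 3.0], comp_scores=[1.0, 1.0])
```

In this toy case the weights are 1.0 and 0.25, so the estimate is 52.0 rather than the unweighted mean of 55.0.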

Model Validation and Evaluation

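Space Insight was evaluated using backtesting against historical CompStak leases. For each lease in the validation dataset, the model’s prediction was compared with the actual rent. Performance was measured using mean absolute percentage error (MAPE):

\[
\mathrm{MAPE} = \frac{100\%}{n} \sum_{t=1}^{n} \left| \frac{y_t - \hat{y}_t}{y_t} \right|
\]

where $y_t$ is the actual rent, $\hat{y}_t$ is the model prediction for lease $t$, and $n$ is the number of leases in the validation dataset.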
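For reference, MAPE can be computed in a few lines. This is the standard definition expressed as a percentage; the function name `mape` and the sample rents are illustrative and are not CompStak validation data.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumes every actual rent is nonzero, which holds for real lease rents.
    """
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Each prediction is off by 10% of the actual rent, so MAPE is about 10.
error = mape([100.0, 50.0], [90.0, 55.0])
```

Because each error is scaled by the actual rent, MAPE lets markets with very different rent levels be compared on the same footing.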
Hybrid weighting consistently outperformed simpler aggregation strategies. Many alternative features were tested; only those that meaningfully improved performance and produced clear, interpretable adjustments were retained. Because validation relies on CompStak’s historical rents, model accuracy improves as the underlying dataset grows. Markets with deeper and more recent lease coverage consistently show lower error rates. As CompStak collects additional transactions over time, the model benefits from a richer and more current training signal, leading to progressively stronger and more reliable predictions.

Figure 1 summarizes validation performance across twelve markets.