Programmatic Guide to the Definition of Mortgage Backed Securities
A mortgage-backed security (MBS) is not just another bond; it is a data-driven financial instrument created by pooling thousands of individual home loans and converting their cash flows into tradable securities. For structured-finance analysts and quants, the practical definition of an MBS lies in its data lineage and programmatic accessibility. Understanding the securitization process, in which mortgages are bundled into a trust that issues bonds, is fundamental. This guide breaks down the technical structure, data sources, and analytical workflows essential for monitoring MBS performance, moving beyond surface-level explainers to provide verifiable, reproducible insights. You can visualize the end result of this data on platforms like Dealcharts.
MBS Market & Context Overview
At its core, an MBS is a bond that entitles investors to the principal and interest payments from a pool of underlying mortgages. This mechanism transforms illiquid, individual loans into highly liquid instruments, a cornerstone of modern capital markets. By 2021, the U.S. MBS market was colossal, with over $12 trillion outstanding and daily trading volumes often exceeding $300 billion. This scale makes it one of the deepest fixed-income markets globally. For a deeper dive, you can explore the history of MBS and its market growth to grasp the scale.

For an analyst, the key insight is that securitization bifurcates the mortgage ecosystem: lenders originate loans, and investors assume the associated interest rate, prepayment, and credit risks. Who ultimately bears the credit risk divides MBS into two distinct classes:
- Agency MBS: Guaranteed by government-sponsored enterprises (GSEs) like Fannie Mae and Freddie Mac, or by the government agency Ginnie Mae. Ginnie Mae's guarantee carries the full faith and credit of the U.S. government, while the GSE guarantees are generally treated by the market as government-supported. Either way, the guarantee effectively removes credit risk for the investor, making prepayment risk the primary analytical focus.
- Non-Agency MBS (Private-Label): Issued by private entities without a government backstop. Investors are exposed to both credit and prepayment risk. The credit risk is mitigated through internal credit enhancements such as subordination and overcollateralization, which are detailed in the deal's prospectus.
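To make subordination concrete, here is a minimal sketch of how credit support is often expressed: the share of the collateral balance that ranks below a given tranche. The tranche names and balances are hypothetical and not taken from any real deal, and the calculation ignores excess spread and other deal-specific features.

```python
from dataclasses import dataclass


@dataclass
class Tranche:
    name: str
    balance: float  # current principal balance


def credit_enhancement(tranches, collateral_balance):
    """Return {tranche name: credit support as a fraction of collateral}.

    `tranches` must be ordered most senior first. Support for a tranche is
    the collateral left over after covering that tranche and everything
    senior to it, so any overcollateralization is included automatically.
    """
    support = {}
    senior_and_self = 0.0
    for tranche in tranches:
        senior_and_self += tranche.balance
        support[tranche.name] = (collateral_balance - senior_and_self) / collateral_balance
    return support


# Hypothetical capital structure: 980 of bonds backed by 1,000 of loans.
deal = [Tranche("A", 800.0), Tranche("M", 120.0), Tranche("B", 60.0)]
ce = credit_enhancement(deal, collateral_balance=1000.0)
for name, level in ce.items():
    print(f"{name}: {level:.1%}")
```

In this made-up structure the senior class has 20% of the pool beneath it, while the most junior class is protected only by the overcollateralization.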
The 2008 financial crisis, largely fueled by risks in the non-agency MBS market, led to strict new reporting regulations like Regulation AB II. For analysts today, this legacy mandates a rigorous focus on data quality, verifiable lineage, and a healthy skepticism of overly complex structures.
The Data and Technical Angle of MBS
An MBS is a data instrument. Its value, risk, and performance are encoded in the thousands of data points originating from its underlying loan pool. Tracing this information from source filings to analytical models is the core discipline of MBS analysis. The entire data lineage begins with public filings mandated by the Securities and Exchange Commission (SEC).
- 424B5 Prospectus: The deal's blueprint, filed at issuance. It contains the transaction structure, initial collateral characteristics, payment waterfall logic, and key counterparties.
- 10-D Remittance Report: The ongoing, periodic performance report (monthly or quarterly). It provides critical remittance data: principal and interest collections, delinquencies, defaults, and realized losses.
- Exhibit Files (EX-102, EX-103): The machine-readable goldmine. For deals subject to asset-level disclosure, these exhibits are filed on Form ABS-EE alongside the 10-D: EX-102 is the asset data file in XML, and EX-103 carries related explanatory documents. Together they provide the granular numbers required for programmatic analysis.
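A full-submission text file on EDGAR bundles every document in a filing, each wrapped in a DOCUMENT block with a TYPE line. The sketch below shows one way to split such a file and pick out a particular exhibit type. It assumes that SGML-style layout and runs against a tiny synthetic sample, not a real filing.

```python
import re

DOC_RE = re.compile(r"<DOCUMENT>(.*?)</DOCUMENT>", re.DOTALL)
TYPE_RE = re.compile(r"<TYPE>([^\n<]+)")
TEXT_RE = re.compile(r"<TEXT>(.*?)</TEXT>", re.DOTALL)


def split_documents(submission_text):
    """Yield (document type, body text) for each DOCUMENT block."""
    for match in DOC_RE.finditer(submission_text):
        block = match.group(1)
        doc_type = TYPE_RE.search(block)
        body = TEXT_RE.search(block)
        if doc_type and body:
            yield doc_type.group(1).strip(), body.group(1).strip()


# Synthetic submission used only to exercise the function.
sample = """<DOCUMENT>
<TYPE>ABS-EE
<SEQUENCE>1
<TEXT>
cover page
</TEXT>
</DOCUMENT>
<DOCUMENT>
<TYPE>EX-102
<SEQUENCE>2
<TEXT>
<XML>
<assetData><assets/></assetData>
</XML>
</TEXT>
</DOCUMENT>"""

docs = list(split_documents(sample))
exhibits = [body for doc_type, body in docs if doc_type == "EX-102"]
print([doc_type for doc_type, _ in docs])
```

Selecting by document type is more robust than grabbing the first XML block in the file, since a submission can contain several.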
The primary data challenge is accessing, parsing, and linking this information programmatically. The workflow involves mapping a deal's CUSIPs to the issuing trust's Central Index Key (CIK) on the SEC's EDGAR system, fetching the relevant filings, and writing scripts to extract structured data from inconsistent exhibit formats. Platforms like Dealcharts are designed to solve this data engineering problem by providing a pre-linked context graph of deals, filings, and counterparties via an API.
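The "fetch the relevant filings" step can be sketched as a filter over a filing index. The example below assumes the column-oriented layout of EDGAR's submissions JSON (parallel lists under filings, then recent); the payload, CIK, and accession numbers are placeholders, and the field names should be checked against the SEC's own documentation before relying on them.

```python
def filings_of_type(submissions, wanted_forms):
    """Return [(form, filing date, accession number)] for the wanted forms."""
    recent = submissions["filings"]["recent"]
    rows = zip(recent["form"], recent["filingDate"], recent["accessionNumber"])
    return [row for row in rows if row[0] in wanted_forms]


def archive_url(cik, accession_number):
    """Build the full-submission text URL for one filing."""
    folder = accession_number.replace("-", "")
    return (
        f"https://www.sec.gov/Archives/edgar/data/{int(cik)}/"
        f"{folder}/{accession_number}.txt"
    )


# Hypothetical index payload; every identifier below is a placeholder.
sample = {
    "cik": "0000000123",
    "filings": {
        "recent": {
            "form": ["10-D", "ABS-EE", "8-K"],
            "filingDate": ["2023-04-20", "2023-04-20", "2023-03-01"],
            "accessionNumber": [
                "0000000000-23-000002",
                "0000000000-23-000001",
                "0000000000-23-000000",
            ],
        }
    },
}

hits = filings_of_type(sample, {"10-D", "ABS-EE"})
urls = [archive_url(sample["cik"], accession) for _, _, accession in hits]
for url in urls:
    print(url)
```

Keeping the accession number next to each hit preserves the link back to the source filing for everything derived later.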
Example Workflow: Programmatic Surveillance
Effective MBS surveillance is not a manual process of reading PDFs; it is a systematic, programmatic workflow for extracting and monitoring collateral performance at scale. The objective is to transform raw, unstructured data from 10-D filings into clean, actionable metrics like delinquency rates, prepayment speeds, and credit losses.
Let's illustrate with a Python snippet that extracts a delinquency metric for a specific MBS trust directly from an SEC filing, reading the 60-89 day bucket that feeds the 60+ day delinquency rate. This code demonstrates clear data lineage: from the source URL to the final, calculated insight.
```python
import requests
import xml.etree.ElementTree as ET

# --- Step 1: Define the Source (Data Lineage) ---
# Pinpoint the exact SEC filing using the issuer's CIK and the filing's accession number.
# This makes the data source explicit and verifiable.
# Example: Wells Fargo Commercial Mortgage Trust 2021-C60, 10-D filed on 04/20/2023
cik = "0001869899"
accession_number = "0001539497-23-000961"
# EDGAR archive paths are conventionally written with the CIK stripped of leading zeros.
filing_url = (
    f"https://www.sec.gov/Archives/edgar/data/{cik.lstrip('0')}/"
    f"{accession_number.replace('-', '')}/{accession_number}.txt"
)

# --- Step 2: Fetch and Parse the Data ---
# Retrieve the raw filing text and extract the structured XML data (Exhibit 102/103).
response = requests.get(
    filing_url,
    headers={"User-Agent": "Analyst Firm analyst@firm.com"},
    timeout=30,
)
response.raise_for_status()
filing_text = response.text

try:
    # Isolate the first <XML> block. The <XML> wrapper is an EDGAR container tag,
    # so only the text between the tags is handed to the parser.
    start_xml = filing_text.index("<XML>") + len("<XML>")
    end_xml = filing_text.index("</XML>", start_xml)
    root = ET.fromstring(filing_text[start_xml:end_xml].strip())

    # Define XML namespace to correctly locate elements
    ns = {"ns": "http://www.sec.gov/edgar/document/absee/autoloan/2023-01-01"}  # Note: Namespace can vary

    # --- Step 3: Transform to Insight ---
    # Extract specific data points using XPath and calculate the delinquency rate.
    # Note: XML tags are specific to each deal/servicer report. This is a hypothetical example.
    # For a real CMBS deal, you'd parse CREFC/RegABII XML formats.
    # Only the 60-89 day bucket is read here; a full 60+ figure would add the later buckets.
    delinquent_60_89_balance = float(root.find(".//ns:currentBalanceAmount60to89DaysDelinquent", ns).text)
    total_pool_balance = float(root.find(".//ns:assetPoolCurrentPrincipalBalanceAmount", ns).text)
    delinquency_rate = (delinquent_60_89_balance / total_pool_balance) * 100

    print(f"Source: {filing_url}")
    print(f"Derived Insight: 60-89 Day Delinquency Rate = {delinquency_rate:.2f}%")
except (ValueError, AttributeError, ET.ParseError):
    # ValueError: no <XML> block or non-numeric text; AttributeError: tag not found;
    # ParseError: the extracted block is not well-formed XML.
    print(f"Could not find or parse XML data in filing: {filing_url}")
```
This workflow creates a transparent, reproducible data pipeline. Any analyst can execute the code, verify the source, and understand precisely how the metric was derived, building trust in the resulting models and conclusions. To see a real-world deal structure, you can explore WFCM 2021-C60 on Dealcharts.
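Once remittance figures have been parsed, the same metric can be tracked across periods. The sketch below assumes each period has already been reduced to a pool balance and a dictionary of delinquent balances by bucket; the bucket labels, periods, and balances are invented for illustration, and treating only the 60-89 and 90+ buckets as "60+" is a simplifying assumption.

```python
SERIOUS_BUCKETS = ("60-89", "90+")


def delinquency_rate_60_plus(report):
    """60+ day delinquency rate, in percent, for one remittance period."""
    delinquent = sum(report["delinquent"].get(bucket, 0.0) for bucket in SERIOUS_BUCKETS)
    return 100.0 * delinquent / report["pool_balance"]


# Hypothetical parsed remittance data for two consecutive periods.
reports = [
    {"period": "2023-03", "pool_balance": 1000.0,
     "delinquent": {"30-59": 30.0, "60-89": 12.0, "90+": 8.0}},
    {"period": "2023-04", "pool_balance": 980.0,
     "delinquent": {"30-59": 25.0, "60-89": 14.7, "90+": 9.8}},
]

series = {r["period"]: delinquency_rate_60_plus(r) for r in reports}
for period, rate in series.items():
    print(f"{period}: {rate:.2f}%")
```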
Implications for Modeling and Risk Monitoring
A programmatic, data-lineage-driven approach to MBS analysis fundamentally enhances analytical capabilities. When models are built on verifiable data pipelines, their outputs become more explainable and defensible. This is critical for risk monitoring, where understanding why a metric changed (e.g., a spike in delinquencies) is as important as the change itself. Tracing an anomaly back to a specific servicer report or loan pool provides the context needed for accurate diagnosis.
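One simple way to keep that diagnostic trail is to carry the source filing identifier alongside every metric, so a flagged change points straight back to a report. The following is a rough sketch under that assumption; the threshold, periods, rates, and source labels are all placeholders.

```python
def flag_jumps(history, threshold_pts=0.5):
    """Return periods whose rate rose by more than threshold_pts versus the prior period.

    Each item in `history` is a dict with 'period', 'rate' (percent), and
    'source' (an identifier for the filing the rate was derived from).
    """
    flagged = []
    for prev, curr in zip(history, history[1:]):
        change = curr["rate"] - prev["rate"]
        if change > threshold_pts:
            flagged.append({
                "period": curr["period"],
                "change_pts": round(change, 2),
                "source": curr["source"],
            })
    return flagged


# Hypothetical metric history; 'source' values stand in for accession numbers.
history = [
    {"period": "2023-01", "rate": 1.8, "source": "filing-placeholder-1"},
    {"period": "2023-02", "rate": 1.9, "source": "filing-placeholder-2"},
    {"period": "2023-03", "rate": 2.9, "source": "filing-placeholder-3"},
]

alerts = flag_jumps(history)
for alert in alerts:
    print(alert)
```

A fixed threshold is the crudest possible rule; the point of the sketch is that each alert names the filing to inspect.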
For AI and LLM applications, this structured context is even more crucial. A context engine that can link a CUSIP to its issuing deal, retrieve the latest 10-D, parse the delinquency data, and cite the source filing transforms a generic model into a specialized financial analyst. This "model-in-context" approach, a core theme of CMD+RVL, ensures that automated insights are grounded in verifiable reality, moving beyond statistical correlation to causal explanation.
How Dealcharts Helps
Building and maintaining the data pipelines required for programmatic MBS surveillance is a significant data engineering challenge. Dealcharts is an open context graph designed to solve this problem by connecting the disparate datasets central to structured finance. It automates the ingestion and linking of SEC filings, deals, shelves, tranches, and counterparties. Instead of wrestling with data extraction and mapping, analysts can access clean, structured, and cross-referenced information directly through an API. This allows teams to focus on analysis and model building, not data plumbing, while ensuring every data point is traceable to its source.
Conclusion
The modern definition of a mortgage-backed security is inseparable from its data. For quants, data engineers, and AI professionals, an MBS is a dynamic data instrument whose behavior is governed by verifiable inputs from sources like the 424B5 prospectus and monthly 10-D reports. By adopting programmatic workflows with a strict focus on data lineage, analysts can build explainable, reproducible models that provide a true competitive edge. This approach, which forms the foundation of frameworks like CMD+RVL, is the future of credible, automated financial analytics.
A Few Common Questions
Agency vs. Non-Agency MBS: What's the Real Difference?
The primary distinction is the presence of a credit guarantee. Agency MBS are guaranteed by the GSEs or Ginnie Mae, which effectively eliminates credit risk for the investor, making prepayment risk the main variable. Non-Agency MBS lack this guarantee and rely on internal credit enhancements (e.g., subordination) built into the deal structure. Analyzing non-agency deals requires a thorough credit analysis of the underlying collateral and the deal's payment waterfall.
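The loss side of a payment waterfall can be illustrated with a reverse-sequential write-down, where realized losses are absorbed by the most junior class first. This is a simplified sketch with hypothetical tranche names and balances; real deals define loss allocation in their governing documents and may apply excess spread or overcollateralization before any bond is written down.

```python
def allocate_losses(balances, loss):
    """Write down tranche balances starting from the most junior tranche.

    `balances` is a list of (name, balance) ordered most senior first.
    Returns a new list in the same order with post-loss balances.
    """
    remaining = loss
    result = []
    for name, balance in reversed(balances):
        writedown = min(balance, remaining)
        remaining -= writedown
        result.append((name, balance - writedown))
    result.reverse()
    return result


# Hypothetical structure taking a realized loss of 100.
structure = [("A", 800.0), ("M", 120.0), ("B", 60.0)]
after = allocate_losses(structure, loss=100.0)
for name, balance in after:
    print(f"{name}: {balance:.1f}")
```

Here the junior class is wiped out, the mezzanine class absorbs the remainder, and the senior class is untouched.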
What's the Big Deal with Prepayment Risk?
Prepayment risk is the uncertainty over when homeowners will pay off their mortgages relative to the scheduled term, most often because of refinancing. When interest rates fall, homeowners refinance, and MBS investors receive their principal back sooner than expected. They must then reinvest this capital at lower prevailing rates, reducing their overall return. This early-return scenario is known as contraction risk; its mirror image, extension risk, arises when rates rise and prepayments slow. Together they make MBS valuation more complex than that of a standard bond with a fixed maturity.
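The effect of prepayment speed on timing can be sketched with the standard conversion from an annual conditional prepayment rate (CPR) to a single monthly mortality rate (SMM), applied to a level-pay pool. The pool balance, coupon, term, and CPR values below are illustrative assumptions, and the model treats the pool as a single loan with a constant CPR.

```python
def smm_from_cpr(cpr):
    """Convert an annual CPR to a single monthly mortality rate (SMM)."""
    return 1.0 - (1.0 - cpr) ** (1.0 / 12.0)


def weighted_average_life(balance, annual_rate, term_months, cpr):
    """Weighted-average life of principal, in years, at a constant CPR."""
    r = annual_rate / 12.0
    smm = smm_from_cpr(cpr)
    original = balance
    weighted = 0.0
    for month in range(1, term_months + 1):
        remaining_term = term_months - month + 1
        # Level payment recomputed on the surviving balance each month.
        payment = balance * r / (1.0 - (1.0 + r) ** -remaining_term)
        scheduled = payment - balance * r
        prepaid = (balance - scheduled) * smm
        principal = scheduled + prepaid
        weighted += month * principal
        balance -= principal
    return weighted / original / 12.0


# Hypothetical 30-year pool with a 6% coupon at two prepayment speeds.
slow = weighted_average_life(1_000_000.0, 0.06, 360, cpr=0.05)
fast = weighted_average_life(1_000_000.0, 0.06, 360, cpr=0.20)
print(f"WAL at 5% CPR: {slow:.1f} years")
print(f"WAL at 20% CPR: {fast:.1f} years")
```

Faster prepayments pull principal forward and shorten the weighted-average life, which is the contraction effect in miniature.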
Where Do I Find the Raw Loan-Level Data?
For registered deals subject to Regulation AB II's asset-level disclosure requirements (the rule was adopted in 2014, with asset-level compliance required from late 2016), issuers must file detailed, loan-level data with the SEC via Form ABS-EE. This standardized XML file, found on EDGAR, contains granular information for every loan in the pool. For older deals, obtaining this data can be more difficult, often requiring access to historical servicer reports or third-party data providers. Programmatic access to these datasets is a core competency for quantitative MBS analysis.
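Because namespaces differ across asset classes and schema versions, one hedged approach is to match elements by local name only. The sketch below uses the `{*}` namespace wildcard available in Python 3.8 and later; the namespace URI, tag names, and loan values in the sample are placeholders, not the actual ABS-EE schema.

```python
import xml.etree.ElementTree as ET

# Synthetic loan-level document; URI and tag names are hypothetical.
sample_xml = """<assetData xmlns="http://example.com/hypothetical/assetdata">
  <assets>
    <assetNumber>1</assetNumber>
    <balance>250000.00</balance>
    <interestRate>0.045</interestRate>
  </assets>
  <assets>
    <assetNumber>2</assetNumber>
    <balance>150000.00</balance>
    <interestRate>0.050</interestRate>
  </assets>
</assetData>"""


def parse_assets(xml_text, record_tag="assets"):
    """Return one dict per loan record, keyed by namespace-free tag name."""
    root = ET.fromstring(xml_text)
    records = []
    for asset in root.iterfind(f".//{{*}}{record_tag}"):
        records.append({child.tag.split("}")[-1]: child.text for child in asset})
    return records


def weighted_average_coupon(records):
    """Balance-weighted average interest rate across loan records."""
    total = sum(float(r["balance"]) for r in records)
    return sum(float(r["balance"]) * float(r["interestRate"]) for r in records) / total


loans = parse_assets(sample_xml)
wac = weighted_average_coupon(loans)
print(f"{len(loans)} loans, WAC = {wac:.4%}")
```

Matching on local names trades strictness for resilience, so it is worth logging the namespace actually seen in each file.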