SEC Form 4 Insider Trading: The Federal Database Behind Corporate Insider Stock Transactions
SEC Form 4 filings are the mandatory disclosure that every corporate officer, director, and ten-percent shareholder must submit within two business days of a transaction in company stock. Together they form a near-real-time public record of insider buying and selling at every US public company: more than 4 million filings in the EDGAR database, and the foundation of academic research on whether insider trading signals future stock performance.
What Form 4 Is
Form 4, formally titled the “Statement of Changes in Beneficial Ownership,” is the primary ongoing disclosure instrument created by Section 16(a) of the Securities Exchange Act of 1934. It records every transaction that alters the beneficial ownership position of a corporate insider in the equity securities of their company — purchases, sales, equity awards, option exercises, gifts, tax withholdings, and a range of other transaction types — and must be filed electronically with the SEC on EDGAR within two business days of the transaction date.
The two-business-day deadline is a product of the Sarbanes-Oxley Act of 2002. Before SOX, the prior Section 16 regime allowed insiders to report their transactions on a monthly basis, with reports due within ten days after the month's close. A trade early in the month could therefore go undisclosed for up to roughly 40 days, which gave insiders significant latitude to trade and remain silent during the most information-sensitive periods. Congress shortened the window to two business days, enacted as Section 403 of SOX, after the Enron scandal made visible the volume of executive stock sales that had occurred while employees held worthless 401(k) balances. Electronic EDGAR filing became mandatory for all Section 16 reporters in June 2003, converting Form 4 from a paper-based periodic disclosure into a near-real-time public feed.
The statutory structure also creates two companion forms. Form 3 is the “initial statement of beneficial ownership,” filed within ten days of becoming a Section 16 reporter for the first time at a given company — on assuming an officer role, joining the board, or crossing the ten-percent ownership threshold. Form 3 captures the opening position. Form 5 is the annual statement, due within 45 days after the fiscal year end, covering any transactions exempt from or inadvertently omitted from the two-day Form 4 requirement. In practice, Form 4 is the dominant instrument; Forms 3 and 5 exist at the margin, and essentially all practitioner and academic analysis of insider activity is built on the Form 4 corpus.
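The three deadlines described above reduce to simple date arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it counts Monday through Friday as business days without accounting for federal holidays or any rule that rolls a weekend deadline forward.

```python
from datetime import date, timedelta


def add_business_days(start: date, n: int) -> date:
    """Advance n weekdays past start (Mon-Fri only; holidays are ignored)."""
    d = start
    while n > 0:
        d += timedelta(days=1)
        if d.weekday() < 5:  # 0-4 are Monday through Friday
            n -= 1
    return d


def form4_due(transaction_date: date) -> date:
    """Form 4: two business days after the transaction date."""
    return add_business_days(transaction_date, 2)


def form3_due(event_date: date) -> date:
    """Form 3: ten calendar days after becoming a Section 16 reporter."""
    return event_date + timedelta(days=10)


def form5_due(fiscal_year_end: date) -> date:
    """Form 5: 45 calendar days after the issuer's fiscal year end."""
    return fiscal_year_end + timedelta(days=45)


print(form4_due(date(2024, 3, 7)))    # Thursday trade
print(form4_due(date(2024, 3, 8)))    # Friday trade
print(form5_due(date(2023, 12, 31)))  # calendar-year issuer
```

A production compliance calendar would need a holiday table; the point here is only that a Friday trade is not due until the following Tuesday.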
Who Must File
Section 16(a) imposes disclosure obligations on three categories of persons, collectively known as “Section 16 insiders”:
- Officers. The SEC defines “officer” for Section 16 purposes in Rule 16a-1(f): the president, principal financial officer, principal accounting officer, any vice president in charge of a principal business unit or function, and any other person who performs a policy-making function for the issuer. This definition is deliberately broad. A vice president of a small division at a major company may be a Section 16 officer if the company determines that the role involves policy-making. The company is responsible for identifying who qualifies and notifying those persons of their obligations. Misidentification — failing to designate someone who legally qualifies — does not relieve the individual of liability, but Section 16(a) enforcement is generally pursued by the SEC on a complaint-driven rather than proactive audit basis.
- Directors. Every member of the board of directors of the issuer, including independent directors who own no shares at the time of their appointment. A director who serves on multiple public company boards has separate Section 16 obligations at each company and must file at each. Directors who hold no shares and engage in no transactions must still file Form 3 upon joining the board; their ongoing Form 4 obligations arise only when they transact.
- Ten-percent beneficial owners. Any person who beneficially owns more than ten percent of any registered class of equity securities. Unlike the officer and director categories, which are defined by role regardless of share ownership, ten-percent owner status is purely quantitative and can shift as the person transacts or as other shareholders enter and exit. An activist investor who builds a stake above ten percent becomes subject to Section 16 for any further transactions. The ten-percent threshold is measured against the number of shares outstanding of the class, not against the insider's total portfolio value or a different securities class.
The reporting obligation attaches only to transactions involving equity securities registered under Section 12 of the Exchange Act, which covers listed companies and over-the-counter issuers above the registration threshold (generally companies with more than $10 million in assets and a class of equity held of record by 2,000 or more persons, or by 500 or more persons who are not accredited investors). Private companies, regardless of size, have no Section 16 obligation because their equity is not registered under Section 12. When a private company completes an IPO, its existing insiders become Section 16 reporters as of the effective date of the registration statement, and their Form 3 filings are due no later than that effective date rather than ten days afterward.
Form Structure
A Form 4 filing on EDGAR is an XML document. Each filing covers exactly one issuer and usually one reporting person (joint filings can list several reporting persons), and may contain multiple individual transactions reported in two tables: Table I for non-derivative securities (common stock, preferred stock) and Table II for derivative securities (options, warrants, restricted stock units, convertible instruments). The key data fields:
- Issuer CIK. The SEC's Central Index Key for the company, a persistent numeric identifier assigned at registration. The issuer CIK is the join key linking Form 4 data to all other EDGAR filings by the company: annual reports, proxy statements, and 8-K current events filings. Form 13F institutional holdings are filed by the investment managers rather than the issuer, so linking to them requires a security identifier such as CUSIP rather than the issuer CIK.
- Reporting person CIK and name. The SEC assigns a separate CIK to the reporting person as a distinct EDGAR filer. CIK-based person tracking is essential for longitudinal analysis; name strings in Form 4 filings appear in multiple variants across filings and issuers (middle initial present or absent, name order differences, hyphenated surnames inconsistently applied), while the reporting person CIK is stable across career changes and company changes.
- Relationship to issuer. Checkboxes for officer, director, and ten-percent owner, plus a free-text title field. The title field (“Chief Executive Officer,” “Executive Vice President and General Counsel,” “Lead Independent Director”) is unstandardized across filings and requires normalization for systematic role-based analysis.
- Transaction date. The date the transaction was executed: the trade date for market transactions, the award date for equity grants, the exercise date for options. Not the settlement date and not the EDGAR filing date. Event studies of insider trading use either the transaction date or the later filing date as the event date, depending on whether they measure what the insider knew or what the market could observe, so the two must be kept distinct.
- Security title. The class of security transacted: “Common Stock,” “Class A Common Stock,” or, for derivative tables, the name of the derivative instrument (“Stock Option (Right to Buy)”).
- Transaction code. A single letter identifying the nature of the transaction. The most analytically important field on the form; discussed in detail in the next section.
- Shares transacted. For non-derivative securities, a share count. For derivative securities, the number of derivative units, with the underlying share count in a separate field. Where a single day's purchases or sales execute at several prices (a block order filled in pieces), SEC staff guidance permits reporting them on one line at a weighted-average price with a footnote disclosing the price range; trades on different days are reported on separate lines.
- Price per share. The transaction price for open-market purchases and sales. For grants (code A), typically the grant-date fair value or $0. For option exercises (code M), the exercise price, not the market price. For gifts (code G) and tax withholding (code F), typically $0 or blank.
- Direct or indirect ownership. “D” means the insider holds the shares directly in their own name or account. “I” means indirect beneficial ownership through a trust, limited liability company, family partnership, spouse's account, or other vehicle, with a required footnote explaining the arrangement. Direct purchases are the cleanest signal; indirect purchases through investment vehicles introduce ambiguity about the insider's personal economic commitment.
- Shares owned following transaction. The reporting person's cumulative position in that security class after the reported transaction, stated separately for direct and indirect holdings. This running balance permits reconstruction of the insider's full position history from Form 4 data alone, without reference to the original Form 3 opening position.
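The title normalization mentioned in the field list above can be approximated with a small pattern table. This is a minimal sketch, not a complete taxonomy: the role buckets, the patterns, and the function name `normalize_title` are all assumptions chosen for illustration, and a title such as "Managing Director" would be bucketed as a board director by this naive rule.

```python
import re

# Hypothetical role buckets; extend or reorder for a real screen.
ROLE_PATTERNS = [
    ("CEO", r"\b(chief executive officer|ceo)\b"),
    ("CFO", r"\b(chief financial officer|principal financial officer|cfo)\b"),
    ("COO", r"\b(chief operating officer|coo)\b"),
    ("GENERAL_COUNSEL", r"\b(general counsel|chief legal officer)\b"),
    ("DIRECTOR", r"\bdirector\b"),
]
COMPILED = [(role, re.compile(p, re.IGNORECASE)) for role, p in ROLE_PATTERNS]


def normalize_title(raw_title: str) -> list[str]:
    """Map a free-text Form 4 title to zero or more role buckets."""
    text = re.sub(r"\s+", " ", raw_title).strip()
    roles = [role for role, pattern in COMPILED if pattern.search(text)]
    return roles or ["OTHER"]


for title in (
    "Chief Executive Officer",
    "Executive Vice President and General Counsel",
    "Lead Independent Director",
    "President & CEO, Director",
):
    print(title, "->", normalize_title(title))
```

Returning a list rather than a single label keeps combined roles (an officer who also sits on the board) visible to downstream filters.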
Transaction Codes
The transaction code field is the most analytically consequential field on Form 4. The SEC defines a taxonomy of single-letter codes that applies to both the non-derivative and derivative tables. Understanding what each code represents, and which codes carry genuine informational content, is the foundation of any insider trading analysis.
| Code | Description | Typical use case | Signal value |
|---|---|---|---|
| P | Open-market or private purchase | Insider buys shares at market price with personal funds | High: discretionary, costly commitment |
| S | Open-market or private sale | Insider sells shares at market price | Low: contaminated by diversification, 10b5-1 plans |
| A | Grant or award by issuer | Restricted stock, RSU, or performance share award | None: no insider discretion |
| D | Disposition to issuer | Shares sold back, surrendered, or forfeited to the company under a plan or agreement | None: governed by comp plan terms |
| M | Exercise or conversion of derivative security | Stock option exercise converting to common shares | Context-dependent: see exercise-and-hold vs. cashless |
| F | Payment of exercise price or tax liability in shares | Shares withheld to pay option exercise cost or income tax, including RSU tax at vesting (net settlement) | None: mechanically generated |
| G | Bona fide gift | Shares donated to charity or transferred to family member | None: no directional signal |
| J | Other acquisition or disposition | Divorce settlement, reclassification, or another transaction explained in a footnote (inheritance has its own code, W) | None: excluded from signal analysis |
| C | Conversion of derivative security | Convertible note converts to common shares | Low: typically contractually triggered |
| X | Exercise of in-the-money or at-the-money derivative | Option or warrant exercised, often near or at expiration | Low: often forced by expiration date |
The practical implication of this taxonomy is that filtering the raw Form 4 universe to code P alone eliminates the large majority of filings by count (roughly 85–90 percent, an approximate figure that varies by sample) but preserves the transactions where the insider both exercised personal discretion and committed personal capital. Equity grants (A), dispositions to the issuer and share withholdings for tax (D and F), and option exercises (M) together dominate Form 4 filing volume by count. They are the mechanical outputs of standard executive compensation programs and carry no discretionary signal. Every serious insider buying screen begins with the P-code filter applied to direct-ownership transactions.
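The P-code filter is a one-line predicate once transactions are in tabular form. The sketch below uses synthetic rows; the field names `code` and `direct` are assumptions chosen to match the parsed-transaction dictionaries used by the script later in this article.

```python
from collections import Counter

# Synthetic rows for illustration only.
sample = [
    {"filer": "A", "code": "A", "direct": True, "shares": 10_000},
    {"filer": "A", "code": "F", "direct": True, "shares": 4_200},
    {"filer": "B", "code": "M", "direct": True, "shares": 25_000},
    {"filer": "B", "code": "S", "direct": True, "shares": 25_000},
    {"filer": "C", "code": "P", "direct": False, "shares": 3_000},
    {"filer": "D", "code": "P", "direct": True, "shares": 1_500},
]


def code_mix(transactions):
    """Count transactions per code to see how much volume is mechanical."""
    return Counter(t["code"] for t in transactions)


def discretionary_buys(transactions):
    """Keep only P-coded purchases held directly by the insider."""
    return [t for t in transactions if t["code"] == "P" and t["direct"]]


mix = code_mix(sample)
buys = discretionary_buys(sample)
print(dict(mix))
print(f"{len(buys)} of {len(sample)} rows survive the P-code, direct-ownership filter")
```

Running `code_mix` on a real sample before filtering is a cheap sanity check that the parser is not silently dropping a transaction type.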
Rule 10b5-1 Trading Plans
SEC Rule 10b5-1, adopted in 2000, created an affirmative defense against insider trading liability under Rule 10b-5 for trades executed pursuant to a pre-established written trading plan. An insider who adopts a 10b5-1 plan while not in possession of material non-public information — specifying in advance the amount, price, and timing of future transactions, or a formula for determining those parameters — can execute those trades even if MNPI subsequently comes into the insider's possession. The rationale is that the trading decision predated the informational advantage; execution is mechanical, not discretionary.
Academic research identified a structural flaw in the original rule, which imposed no mandatory cooling-off period between plan adoption and the first trade. Alan Jagolinzer (2009) in Management Science documented that insiders appeared to use 10b5-1 plans strategically: plan sales tended to precede negative news, plans could be initiated shortly before trading began, and early plan terminations tended to precede good news. Insiders trading under 10b5-1 plans outperformed insiders trading outside plans by a statistically significant margin, the opposite of what the “pre-planned, non-informational” rationale would predict. Lauren Cohen, Christopher Malloy, and Lukasz Pomorski (2012) in the Journal of Finance reinforced the broader point that the timing pattern of an insider's trades determines how informative they are.
The SEC addressed this pattern in December 2022 with Rule 10b5-1 amendments that took effect in 2023. The amended rule imposed several significant changes. Officers and directors must now observe a mandatory cooling-off period before the first trade can execute: the later of 90 days after plan adoption or two business days after the issuer files its Form 10-Q or 10-K for the quarter in which the plan was adopted, capped at 120 days. Insiders are limited to one single-trade plan per 12-month period. Plans may not be adopted or modified while the insider is aware of MNPI, and officers and directors must certify this at adoption. Most significantly for data analysts, the amended rule added a checkbox to Forms 4 and 5 indicating that a reported transaction was made under a 10b5-1 plan, together with the plan adoption date, and requires issuers to disclose plan adoptions and terminations by officers and directors in their quarterly reports, creating for the first time a structured trail of plan activity.
For signal-building purposes, the 2023 amendments reduce but do not eliminate the contamination of 10b5-1 plan trades. The conservative analytical posture is to identify plan transactions, using the Form 4 checkbox where it is available and footnote pattern matching for older filings, and then either weight them separately from non-plan purchases or exclude them entirely from a clean signal focused on unambiguously discretionary buying.
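Footnote pattern matching can be sketched with a single regular expression. This is a deliberately naive illustration: the `footnotes` field and both function names are hypothetical, and a plain text match cannot distinguish "pursuant to a Rule 10b5-1 plan" from "not pursuant to a Rule 10b5-1 plan", which is one reason false positives occur.

```python
import re

# Matches common spellings such as "10b5-1", "10b-5-1", and "10b51".
PLAN_RE = re.compile(r"\b10b-?5-?1\b", re.IGNORECASE)


def mentions_10b5_1(footnotes) -> bool:
    """True if any footnote string mentions a 10b5-1 plan."""
    return any(PLAN_RE.search(note) for note in footnotes)


def split_by_plan(transactions):
    """Partition transactions into (plan, non_plan) lists by footnote text."""
    plan, non_plan = [], []
    for t in transactions:
        target = plan if mentions_10b5_1(t.get("footnotes", [])) else non_plan
        target.append(t)
    return plan, non_plan


# Synthetic example rows.
rows = [
    {"id": 1, "footnotes": ["Sales effected pursuant to a Rule 10b5-1 trading plan."]},
    {"id": 2, "footnotes": ["Shares held by a family trust."]},
    {"id": 3},
]
plan, non_plan = split_by_plan(rows)
print([t["id"] for t in plan], [t["id"] for t in non_plan])
```

Keeping the two partitions separate, rather than dropping plan trades outright, leaves the weighting decision to the downstream model.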
Research Applications
The predictive content of insider transactions has been studied for more than four decades, beginning with work by Jeffrey Jaffe (1974) and H. Nejat Seyhun (1986). The consensus findings are robust across methodological approaches, sample periods, and international markets, and they cluster around a small number of durable conclusions.
Open-market purchases predict positive abnormal returns. Seyhun's 1988 study in the Journal of Business documented that aggregate insider buying across all public companies predicts broad market returns, with insiders collectively buying more before market upswings and selling more before downturns. At the individual stock level, P-coded purchases predict positive excess returns of roughly 3 to 6 percent over the six months following the transaction, with the effect strongest for small-cap companies where information asymmetry between insiders and public investors is greatest. The signal decays in large-cap stocks, where deep analyst coverage and continuous information production shrink the gap between what insiders know and what the market knows.
Cluster buying substantially amplifies the signal. When three or more distinct insiders at the same company purchase shares within a 30-day window, the combined signal is stronger than any single purchase would suggest. Jeng, Metrick, and Zeckhauser (2003) in the Review of Economics and Statistics found that insider purchases earn abnormal returns of more than 6 percent per year; cluster buying events at small-cap companies have shown even stronger results in subsequent literature. The intuition is that multiple insiders independently deciding to commit personal capital is harder to explain by noise or idiosyncratic factors than a single insider's purchase.
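The cluster definition above (three or more distinct insiders at one issuer within 30 days) translates into a sliding-window scan. The sketch below is one possible implementation under stated assumptions: the input rows are P-coded purchases, the field names `issuer_cik`, `filer_cik`, and `date` are hypothetical, and only the first qualifying window per issuer is returned.

```python
from collections import defaultdict
from datetime import date, timedelta


def first_cluster_by_issuer(purchases, window_days=30, min_insiders=3):
    """Return {issuer: (window_start, window_end, n_insiders)} for the first cluster."""
    by_issuer = defaultdict(list)
    for p in purchases:
        by_issuer[p["issuer_cik"]].append((date.fromisoformat(p["date"]), p["filer_cik"]))
    window = timedelta(days=window_days)
    clusters = {}
    for issuer, rows in by_issuer.items():
        rows.sort()
        start = 0
        for end in range(len(rows)):
            # Shrink the window until it spans at most window_days.
            while rows[end][0] - rows[start][0] > window:
                start += 1
            insiders = {filer for _, filer in rows[start:end + 1]}
            if len(insiders) >= min_insiders:
                clusters[issuer] = (rows[start][0], rows[end][0], len(insiders))
                break
    return clusters


# Synthetic purchases: issuer A clusters, B is too spread out, C is one insider.
purchases = [
    {"issuer_cik": "A", "filer_cik": "1", "date": "2024-01-02"},
    {"issuer_cik": "A", "filer_cik": "2", "date": "2024-01-10"},
    {"issuer_cik": "A", "filer_cik": "3", "date": "2024-01-25"},
    {"issuer_cik": "B", "filer_cik": "1", "date": "2024-01-02"},
    {"issuer_cik": "B", "filer_cik": "2", "date": "2024-01-05"},
    {"issuer_cik": "B", "filer_cik": "3", "date": "2024-03-20"},
    {"issuer_cik": "C", "filer_cik": "1", "date": "2024-01-02"},
    {"issuer_cik": "C", "filer_cik": "1", "date": "2024-01-03"},
    {"issuer_cik": "C", "filer_cik": "1", "date": "2024-01-04"},
]
clusters = first_cluster_by_issuer(purchases)
print(clusters)
```

Counting distinct reporting-person CIKs rather than rows matters: one insider buying on three consecutive days is not a cluster.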
Opportunistic trading beats routine trading. Cohen, Malloy, and Pomorski (2012) decomposed the insider universe into “routine” traders (insiders who buy or sell in the same calendar month year after year, consistent with mechanical behavior such as automatic dividend reinvestment or scheduled 10b5-1 programs) and “opportunistic” traders (insiders with non-routine transaction timing). Opportunistic insider purchases predict 12-month abnormal returns exceeding 8 percent; routine purchases carry no statistically significant predictive power. Separating the two categories requires at least three years of transaction history per insider to establish a baseline.
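The routine versus opportunistic split can be approximated as follows. This is a simplified reading of the idea described above, not a reproduction of the published methodology: an insider is labeled routine if they traded in the same calendar month in at least three consecutive years, and insiders with fewer than three distinct years of history are left unclassified. All names are hypothetical.

```python
from collections import defaultdict


def has_consecutive_run(years, n):
    """True if the set of years contains n consecutive values (n >= 2)."""
    ordered = sorted(years)
    run = 1
    for prev, cur in zip(ordered, ordered[1:]):
        run = run + 1 if cur == prev + 1 else 1
        if run >= n:
            return True
    return False


def classify_insiders(trades, min_years=3):
    """trades: iterable of (insider_id, 'YYYY-MM-DD'). Returns {insider_id: label}."""
    by_month = defaultdict(lambda: defaultdict(set))
    years_seen = defaultdict(set)
    for insider, iso_date in trades:
        year, month = int(iso_date[:4]), int(iso_date[5:7])
        by_month[insider][month].add(year)
        years_seen[insider].add(year)
    labels = {}
    for insider, months in by_month.items():
        if len(years_seen[insider]) < min_years:
            labels[insider] = "insufficient_history"
        elif any(has_consecutive_run(y, min_years) for y in months.values()):
            labels[insider] = "routine"
        else:
            labels[insider] = "opportunistic"
    return labels


# Synthetic trade history.
trades = [
    ("A", "2019-03-11"), ("A", "2020-03-09"), ("A", "2021-03-15"),
    ("B", "2019-02-04"), ("B", "2020-07-21"), ("B", "2021-11-30"),
    ("C", "2021-05-05"),
]
labels = classify_insiders(trades)
print(labels)
```

The labels describe the insider, not the individual trade, so a screen would typically join them back onto the transaction table before computing returns.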
CEO versus director patterns differ. Academic literature and practitioner experience both suggest that CEO purchases carry stronger signal than director purchases. The CEO has the deepest informational advantage (awareness of order trends, pipeline health, and pending announcements), while independent directors may have material information only on a subset of strategic decisions. Director purchases are more often explained by portfolio rebalancing, estate planning, or public relations considerations (a director buying shares to demonstrate confidence after a stock decline may be acting on reputational grounds rather than private information). Filtering to C-suite officers alone (CEO, CFO, COO) produces a tighter signal than including all Section 16 reporters.
| Insider | Company | Period | Approx. value | Context |
|---|---|---|---|---|
| Warren Buffett | Berkshire Hathaway (BRK) | Ongoing share repurchases (reported in 10-Q and 10-K filings, not on Form 4) | $70B+ cumulative | Issuer buyback rather than a Section 16 transaction; sometimes treated as an analogous insider signal |
| Elon Musk | Tesla (TSLA) | 2022 | ~$23B | S-coded sales following large option exercises, much of it to fund the Twitter acquisition |
| Mark Zuckerberg | Meta Platforms (META) | 2022–2023 | ~$1.4B (via RSU grants held) | Net ownership increases through restricted stock award retention |
| Jamie Dimon | JPMorgan Chase (JPM) | Feb 2024 | ~$150M | First sale of JPMorgan stock during his tenure as CEO, made under a pre-announced plan |
| Jeff Bezos | Amazon (AMZN) | Feb 2024 | ~$8.5B | Large S-coded sales via 10b5-1 plan; illustrative of sale-signal weakness |
High-Profile Cases
Elon Musk and Twitter acquisition filing delays. The SEC opened an investigation into whether Musk violated beneficial ownership disclosure requirements during his acquisition of Twitter (now X) in 2022. Musk began purchasing Twitter shares in late January 2022 and crossed the five-percent threshold that triggers a separate Schedule 13D or 13G beneficial ownership disclosure on or around March 14, 2022, but did not file that disclosure until April 4, eleven days after the ten-day deadline then in force. During the intervening period, he continued purchasing shares, ultimately accumulating a 9.2 percent stake, without public disclosure. The SEC sent him a subpoena in May 2022. The episode illustrated the two-tier disclosure structure for large shareholders: Form 4 governs transactions after crossing the ten-percent threshold, while Schedule 13D and 13G govern initial acquisition of large positions at lower thresholds. Musk's purchases stayed below the Section 16 threshold, so no Form 4 was required, but the late beneficial ownership filing was itself a disclosure violation that the SEC pursued.
Congressional stock trading and the STOCK Act. Members of Congress and senior congressional staff have access to information about pending legislation, regulatory investigations, and government contract decisions that can be material to public companies. Research by Ziobrowski et al. (2004, 2011) documented that senators and House members earned abnormal returns on their equity portfolios in the years before mandatory disclosure, a finding that contributed to the legislative push for the Stop Trading on Congressional Knowledge Act of 2012. The STOCK Act created a Form 4 analog for members of Congress and covered staff: the “Periodic Transaction Report,” which must be filed within 45 days of a covered transaction in excess of $1,000. The STOCK Act dataset is published by Congress on House and Senate disclosure websites and covers hundreds of members. Compliance has been inconsistent; multiple members of Congress have paid nominal fines for late filings without substantive enforcement of the underlying prohibition on trading on MNPI. A separate article on this site covers the STOCK Act and congressional trading data in detail.
Spring-loading and bullet-dodging. Spring-loading refers to the practice of timing option grants to precede positive announcements, giving recipients options priced below the post-announcement market value. Bullet-dodging is the mirror image: delaying option grants until after a negative announcement, so that they are issued at the lower post-announcement exercise price. Both practices are detectable in Form 4 data by examining the option grant dates (A-coded transactions on derivative tables) relative to earnings releases or 8-K filings reporting material events. The related practice of backdating, in which the recorded grant date is chosen retroactively, came to light the same way: several major option backdating scandals of the mid-2000s, including cases at Broadcom, Vitesse Semiconductor, and Comverse Technology, followed academic research on the statistical improbability of option grant dates systematically falling at stock price troughs.
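A first-pass timing screen for spring-loading measures how many days separate each option grant from the next material announcement. The sketch below covers timing only: deciding whether an announcement was good or bad news requires return data that is not modeled here, and the function names and the five-day threshold are assumptions for illustration.

```python
from bisect import bisect_right
from datetime import date


def days_to_next_event(grant_date, sorted_event_dates):
    """Days from a grant to the first event strictly after it, or None."""
    i = bisect_right(sorted_event_dates, grant_date)
    if i == len(sorted_event_dates):
        return None
    return (sorted_event_dates[i] - grant_date).days


def grants_before_events(grant_dates, event_dates, max_gap_days=5):
    """Return (grant_date, gap) pairs for grants shortly before an event."""
    events = sorted(event_dates)
    flagged = []
    for g in grant_dates:
        gap = days_to_next_event(g, events)
        if gap is not None and gap <= max_gap_days:
            flagged.append((g, gap))
    return flagged


# Synthetic grant and announcement dates.
events = [date(2024, 2, 1), date(2024, 5, 2)]
grants = [date(2024, 1, 30), date(2024, 3, 15), date(2024, 5, 2)]
flagged = grants_before_events(grants, events)
print(flagged)
```

Swapping the comparison to look at the most recent event before each grant gives the corresponding bullet-dodging screen.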
EDGAR Data Access
All Form 4 filings are publicly available on EDGAR through several access paths:
- EDGAR full-text search. The EDGAR full-text search system at https://efts.sec.gov/LATEST/search-index supports text searches across all EDGAR filings with form type filtering. Useful for finding filings mentioning a specific company name, or for searching across footnote text for phrases such as “10b5-1 trading plan” or “Rule 10b5-1.” Rate-limited to approximately ten requests per second per IP.
- EDGAR submissions API. The endpoint https://data.sec.gov/submissions/CIK{cik:010d}.json returns a JSON document listing all recent filings for a given company or person CIK, including Form 4s with accession numbers, filing dates, and document lists. This is the preferred programmatic path for pulling all Form 4 filings for a specific issuer or reporting person without parsing quarterly bulk index files.
- Quarterly bulk index files. The most efficient path for large-scale historical analysis. The SEC publishes quarterly index files at https://www.sec.gov/Archives/edgar/full-index/, organized by year and quarter. The form.idx fixed-width text file in each directory lists every EDGAR filing for that quarter with form type, company name, CIK, date filed, and the relative path to the filing. Filtering on form type “4” yields all Form 4 filings; the path field points to the filing's complete submission text file on the archive server, which contains the Form 4 XML.
- EDGAR company or person search. Navigating to https://www.sec.gov/cgi-bin/browse-edgar?action=getcompany&type=4 with a CIK or name query returns a paginated list of Form 4 filings. Useful for manual lookup of a specific executive's transaction history; impractical for bulk analysis.
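Filtering a quarterly form.idx file for Form 4 rows can be sketched without hard-coding column offsets. The parser below splits each line from the right, on the assumption that the last three whitespace-separated tokens are the CIK, the filing date, and the file path; it only handles single-token form types such as "4", and the sample lines are synthetic.

```python
def parse_form_idx(lines, form_type="4"):
    """Yield-style parse of form.idx lines into dicts for one form type."""
    rows = []
    for line in lines:
        parts = line.rstrip().rsplit(None, 3)
        if len(parts) != 4:
            continue
        head, cik, filed, path = parts
        if not cik.isdigit() or not path.startswith("edgar/"):
            continue  # header, separator, or malformed line
        head_parts = head.split(None, 1)
        if len(head_parts) != 2 or head_parts[0] != form_type:
            continue  # different form type (including amendments such as 4/A)
        rows.append({
            "form": head_parts[0],
            "company": head_parts[1].strip(),
            "cik": cik,
            "filed": filed,
            "path": path,
        })
    return rows


# Synthetic lines laid out like a form.idx file.
sample_lines = [
    "Form Type   Company Name              CIK         Date Filed  File Name",
    "---------------------------------------------------------------------",
    "4           EXAMPLE CORP              1234567     2024-01-03  edgar/data/1234567/0001234567-24-000001.txt",
    "4/A         EXAMPLE CORP              1234567     2024-01-05  edgar/data/1234567/0001234567-24-000002.txt",
    "SC 13D      OTHER HOLDINGS LLC        7654321     2024-01-04  edgar/data/7654321/0007654321-24-000009.txt",
]
rows = parse_form_idx(sample_lines)
print(rows)
```

Prefixing each returned path with https://www.sec.gov/Archives/ gives the download URL for that filing.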
The SEC's fair access policy requires automated requests to include a descriptive User-Agent header containing a name, organization, and contact email address, and to limit request rates to no more than ten requests per second. Bulk download scripts that ignore rate limiting risk temporary IP blocks affecting all users on a shared network. The SEC has increasingly enforced this policy and has blocked IP ranges belonging to cloud providers used for large-scale scraping operations.
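A client-side throttle is a simple way to stay under the request ceiling described above. The sketch below is an assumption-laden illustration: the `Throttle` class is hypothetical, the default of eight requests per second is an arbitrary safety margin below the stated limit, and the User-Agent string is a placeholder to be replaced with real contact details.

```python
import time

# Placeholder identity; substitute a real name, organization, and email.
SEC_HEADERS = {"User-Agent": "Example Research Project admin@example.com"}


class Throttle:
    """Enforce a minimum interval between successive calls to wait()."""

    def __init__(self, max_per_second=8.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / max_per_second
        self._clock = clock
        self._sleep = sleep
        self._last = None

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


# Usage (network call shown as a comment so the sketch runs offline):
# throttle = Throttle()
# for url in urls:
#     throttle.wait()
#     resp = requests.get(url, headers=SEC_HEADERS, timeout=30)
```

Injecting the clock and sleep functions keeps the throttle testable without real delays.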
Secondary aggregators have built structured databases on top of the EDGAR primary source. OpenInsider provides a filtered interface focused on open-market buys and sells, with filters for transaction size, insider type, company, sector, and date range, and updates within hours of new EDGAR filings. Quiver Quantitative and other vendors provide programmatic API access to cleaned and structured insider transaction data with computed fields such as rolling cluster counts and percent of shares outstanding. Both categories of secondary source are derivatives of EDGAR; the primary source is the canonical reference for any regulatory or compliance use.
Python: Downloading and Parsing Form 4 Filings from EDGAR
The script below demonstrates the full workflow for programmatic Form 4 analysis: resolving a ticker symbol to an EDGAR CIK, enumerating Form 4 accession numbers from the EDGAR submissions API, fetching and parsing the underlying XML for each filing, and building a monthly insider net-buying time series. The approach works for any listed company and scales to the full EDGAR historical record by iterating across issuers and date ranges.
    import re
    import time
    import xml.etree.ElementTree as ET
    from collections import defaultdict

    import requests

    # ---------------------------------------------------------------------------
    # SEC EDGAR Form 4 Insider Transaction Screen
    # Sources:
    #   Bulk index:  https://www.sec.gov/Archives/edgar/full-index/
    #   EDGAR EFTS:  https://efts.sec.gov/LATEST/search-index?q=%22form+4%22&dateRange=custom
    #   Submissions: https://data.sec.gov/submissions/CIK{cik:010d}.json
    #
    # Strategy:
    #   1. Resolve a ticker to an issuer CIK using the EDGAR company search endpoint.
    #   2. Pull the issuer submission history to enumerate Form 4 accession numbers.
    #   3. Fetch each Form 4 XML and parse non-derivative transactions.
    #   4. Aggregate net buying/selling by month to build a time series.
    # ---------------------------------------------------------------------------

    # Replace with your own name or organization and a contact email address.
    HEADERS = {"User-Agent": "insider-screen research project research@example.com"}
    BASE = "https://www.sec.gov"
    DATA_API = "https://data.sec.gov"
    REQUEST_PAUSE = 0.15  # seconds between requests; stays under 10 requests/second


    def get(url: str, timeout: int = 30) -> requests.Response:
        """GET with the User-Agent header, a short pause, and HTTP error checking."""
        time.sleep(REQUEST_PAUSE)
        resp = requests.get(url, headers=HEADERS, timeout=timeout)
        resp.raise_for_status()
        return resp


    # ── 1. Resolve ticker → CIK ─────────────────────────────────────────────────
    def ticker_to_cik(ticker: str) -> str:
        """Return the zero-padded 10-digit CIK for a given exchange ticker."""
        url = BASE + "/cgi-bin/browse-edgar?action=getcompany&company=&CIK=" + ticker
        url += "&type=4&dateb=&owner=include&count=1&search_text=&output=atom"
        resp = get(url)
        # The Atom feed carries the CIK in its company-info element and in its links.
        m = re.search(r"<cik>(\d+)</cik>|CIK=(\d+)", resp.text)
        if not m:
            raise ValueError("CIK not found for ticker: " + ticker)
        return (m.group(1) or m.group(2)).zfill(10)


    # ── 2. Enumerate Form 4 accession numbers from submissions API ───────────────
    def get_form4_accessions(cik10: str, max_filings: int = 200) -> list[dict]:
        """Return recent Form 4 submission records for an issuer CIK, newest first."""
        data = get(DATA_API + "/submissions/CIK" + cik10 + ".json").json()
        recent = data.get("filings", {}).get("recent", {})
        rows = zip(
            recent.get("form", []),
            recent.get("accessionNumber", []),
            recent.get("filingDate", []),
            recent.get("primaryDocument", []),
        )
        results = []
        for form_type, acc, filed, primary_doc in rows:
            if form_type != "4":
                continue
            results.append({"accession": acc, "filed": filed, "primary_doc": primary_doc})
            if len(results) >= max_filings:
                break
        return results


    # ── 3. Fetch and parse a single Form 4 XML ───────────────────────────────────
    def fetch_form4_xml(cik10: str, accession: str, primary_doc: str) -> str:
        """Download the raw Form 4 XML document for an accession number."""
        # Filing folders live under /Archives/edgar/data/<CIK without leading zeros>/
        # <accession number without dashes>/.
        folder = (
            BASE + "/Archives/edgar/data/" + str(int(cik10)) + "/"
            + accession.replace("-", "") + "/"
        )
        # Assumption: for ownership forms the primaryDocument value usually carries
        # an XSL rendering prefix (for example "xslF345X05/form4.xml"); the raw XML
        # sits at the folder root under the same file name.
        doc_name = primary_doc.rsplit("/", 1)[-1]
        if not doc_name.endswith(".xml"):
            raise ValueError("XML document not found for: " + accession)
        return get(folder + doc_name, timeout=20).text


    def _text(node: ET.Element, path: str, default: str = "") -> str:
        """Return stripped element text, or the default if missing or empty."""
        return (node.findtext(path) or default).strip()


    def _flag(node: ET.Element, tag: str) -> bool:
        """Read a relationship checkbox; filers encode it as 1/0 or true/false."""
        return _text(node, ".//" + tag, "0").lower() in ("1", "true")


    def parse_transactions(xml_text: str) -> list[dict]:
        """Parse non-derivative transactions from a Form 4 XML string."""
        try:
            root = ET.fromstring(xml_text)
        except ET.ParseError:
            return []
        # Joint filings list several reporting owners; only the first is captured.
        header = {
            "issuer_cik": _text(root, ".//issuerCik"),
            "issuer_name": _text(root, ".//issuerName"),
            "filer_cik": _text(root, ".//rptOwnerCik"),
            "filer_name": _text(root, ".//rptOwnerName"),
            "is_officer": _flag(root, "isOfficer"),
            "is_director": _flag(root, "isDirector"),
            "is_10pct": _flag(root, "isTenPercentOwner"),
        }
        txns = []
        for txn in root.findall(".//nonDerivativeTransaction"):
            code = _text(txn, ".//transactionCode")
            try:
                shares = float(_text(txn, ".//transactionShares/value", "0"))
                price = float(_text(txn, ".//transactionPricePerShare/value", "0"))
                post_shares = float(_text(txn, ".//sharesOwnedFollowingTransaction/value", "0"))
            except ValueError:
                continue  # skip rows with non-numeric amounts
            # Purchases count as positive, sales as negative, everything else as zero.
            signed_shares = shares if code == "P" else -shares if code == "S" else 0.0
            txns.append({
                **header,
                "code": code,
                "date": _text(txn, ".//transactionDate/value"),
                "shares": shares,
                "signed_shares": signed_shares,
                "price": price,
                "value_usd": shares * price,
                "direct": _text(txn, ".//directOrIndirectOwnership/value") == "D",
                "post_shares": post_shares,
            })
        return txns


    # ── 4. Build a monthly insider net-buying time series ────────────────────────
    def monthly_net_buying(transactions: list[dict]) -> dict[str, float]:
        """Aggregate signed share counts by YYYY-MM bucket (P positive, S negative)."""
        monthly: dict[str, float] = defaultdict(float)
        for t in transactions:
            if t["code"] not in ("P", "S"):
                continue
            if not t["direct"]:
                continue  # skip indirect holdings for cleaner signal
            month = t["date"][:7]  # e.g. "2024-11"
            monthly[month] += t["signed_shares"]
        return dict(sorted(monthly.items()))


    # ── Main ─────────────────────────────────────────────────────────────────────
    def main(ticker: str = "AAPL") -> None:  # Apple Inc.; change to any listed ticker
        print(f"Resolving CIK for {ticker} ...")
        cik10 = ticker_to_cik(ticker)
        print(f"  CIK: {cik10}")

        print("Fetching Form 4 accession list ...")
        accessions = get_form4_accessions(cik10, max_filings=100)
        print(f"  Found {len(accessions)} Form 4 filings")

        all_txns: list[dict] = []
        for rec in accessions[:50]:  # limit to 50 for demo; remove slice for full history
            try:
                xml = fetch_form4_xml(cik10, rec["accession"], rec["primary_doc"])
                txns = parse_transactions(xml)
            except (requests.RequestException, ValueError) as e:
                print(f"  skip {rec['accession']} -- {e}")
                continue
            for t in txns:
                t["filed"] = rec["filed"]
            all_txns.extend(txns)
        print(f"\nTotal non-derivative transactions parsed: {len(all_txns)}")

        # Filter to open-market buys and sells only
        market_txns = [t for t in all_txns if t["code"] in ("P", "S") and t["direct"]]
        buys = [t for t in market_txns if t["code"] == "P"]
        sells = [t for t in market_txns if t["code"] == "S"]
        print(f"Open-market purchases: {len(buys)} | Open-market sales: {len(sells)}")

        # Largest single purchases
        buys.sort(key=lambda x: x["value_usd"], reverse=True)
        print("\nTop open-market purchases:")
        for t in buys[:5]:
            name = t["filer_name"][:30].ljust(32)
            val = str(int(round(t["value_usd"]))).rjust(14)
            print(f"  {name} ${val}  {t['date']}")

        # Monthly net buying time series
        monthly = monthly_net_buying(all_txns)
        print("\nMonthly net insider buying (shares, direct P minus S):")
        for month, net in list(monthly.items())[-12:]:
            bar = ("+" if net > 0 else "-") * int(abs(net) / 1000)
            sign = "+" if net >= 0 else ""
            print(f"  {month}  {sign}{int(net):>10}  {bar[:40]}")


    if __name__ == "__main__":
        main()
Several implementation notes apply. Ticker-to-CIK resolution via the EDGAR browse-edgar CGI endpoint is convenient for a single lookup, but it depends on scraping the response text; the EDGAR company tickers JSON at https://www.sec.gov/files/company_tickers.json provides a bulk lookup table that is faster and more robust for large batches. The submissions API returns filings in reverse chronological order and paginates older filings into a separate “files” array that requires additional requests to enumerate. For the monthly net-buying time series, the signed share count (positive for P-coded purchases, negative for S-coded sales) provides a simple directional aggregate; dividing by shares outstanding at the period end normalizes for company size. For cluster detection, group the parsed transactions by issuer CIK, sort each group by transaction date, then count distinct reporting-person CIKs within rolling 30-day windows.
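The bulk lookup table mentioned above can be turned into a ticker index in a few lines. The sketch assumes the JSON is an object keyed by row number whose values carry `cik_str`, `ticker`, and `title` fields; that layout is an assumption to verify against the live file, and the function name is hypothetical. The sample below stands in for the downloaded document so the example runs offline.

```python
def build_ticker_index(company_tickers: dict) -> dict:
    """Map upper-case ticker -> zero-padded 10-digit CIK string."""
    index = {}
    for row in company_tickers.values():
        index[row["ticker"].upper()] = str(row["cik_str"]).zfill(10)
    return index


# Stand-in for the parsed JSON document (assumed layout).
sample_json = {
    "0": {"cik_str": 320193, "ticker": "AAPL", "title": "Apple Inc."},
    "1": {"cik_str": 1234567, "ticker": "exmp", "title": "Example Corp (synthetic)"},
}
index = build_ticker_index(sample_json)
print(index["AAPL"])
```

Building the index once and reusing it avoids one network request per ticker when screening many issuers.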
Limitations
Reporting delays and late filings. The two-business-day deadline is a legal requirement, not a universal practice. Late Form 4 filings are common, particularly among smaller companies with less sophisticated compliance infrastructure and for transactions involving complex ownership structures. The SEC publishes delinquency data and sends delinquency notices, but does not pursue enforcement against late filers as a routine matter. Two dates matter when measuring the lag: the Form 4 timestamp on EDGAR reflects the actual filing date, while the transaction date embedded in the XML is self-reported. Any system that monitors the real-time EDGAR feed will therefore see some transactions disclosed days or weeks after they occurred.
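One way to quantify the delay is to count business days between the self-reported transaction date and the EDGAR filing date. The sketch below assumes both values are ISO-formatted strings stored under the date and filed keys, as in the script above; it counts weekdays only and ignores federal holidays, so it can overstate the lag by a day around holidays.

```python
from datetime import date, timedelta


def business_days_between(start, end):
    """Count weekdays after start, up to and including end (holidays ignored)."""
    days = 0
    cur = start
    while cur < end:
        cur += timedelta(days=1)
        if cur.weekday() < 5:  # Monday=0 ... Friday=4
            days += 1
    return days


def filing_lag(txn):
    """Business-day lag between transaction date and filing date."""
    traded = date.fromisoformat(txn["date"][:10])
    filed = date.fromisoformat(txn["filed"][:10])
    return business_days_between(traded, filed)


def late_filings(txns, deadline=2):
    """Return the transactions filed more than deadline business days late."""
    return [t for t in txns if filing_lag(t) > deadline]
```

A transaction dated Friday and filed the following Tuesday has a lag of two business days under this count and is treated as on time.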
Derivative complexity. Table II of Form 4, the derivative securities table, covers options, RSUs, performance units, convertible instruments, and other equity derivatives, each with distinct economic characteristics. An option exercise followed immediately by a same-day market sale (a cashless “exercise and sell”) is fundamentally different from an option exercise followed by a multi-year hold. Parsing the combination requires joining M-coded exercises in the derivative table to S-coded sales in the non-derivative table by filer CIK, issuer CIK, and transaction date. RSU vesting (A-coded award) paired with tax withholding (D- or F-coded disposition) on the same date similarly requires transaction-level pairing to understand the net economic effect. The raw XML provides the data; the analytical logic to interpret it correctly is non-trivial.
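The exercise-to-sale join might look like the sketch below. It is illustrative only and assumes each parsed row is a dict with the hypothetical keys table ("derivative" or "non_derivative"), filer_cik, issuer_cik, and shares, in addition to the code and date keys; the parser shown earlier handles only the non-derivative table, so the derivative rows would have to come from a separate pass over Table II.

```python
from collections import defaultdict


def classify_exercises(txns):
    """Pair M-coded derivative exercises with same-day S-coded sales.

    Rows are matched on (filer_cik, issuer_cik, date); all key names other
    than "code" and "date" are assumptions made for this sketch.
    """
    # Total shares sold per filer, issuer, and day in the non-derivative table.
    sold = defaultdict(float)
    for t in txns:
        if t["table"] == "non_derivative" and t["code"] == "S":
            sold[(t["filer_cik"], t["issuer_cik"], t["date"])] += t["shares"]

    results = []
    for t in txns:
        if t["table"] == "derivative" and t["code"] == "M":
            key = (t["filer_cik"], t["issuer_cik"], t["date"])
            same_day = sold.get(key, 0.0)
            label = "exercise_and_sell" if same_day > 0 else "no_same_day_sale"
            results.append({**t, "same_day_sold": same_day, "label": label})
    return results
```

The "no_same_day_sale" label says nothing about how long the shares are subsequently held; confirming a multi-year hold would require checking later filings by the same reporting person.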
10b5-1 plan opacity before 2023. Before the 2023 SEC rule amendments, there was no structured field on Form 4 to indicate whether a transaction was executed pursuant to a 10b5-1 plan. The only disclosure was an optional footnote, and the language of those footnotes varied substantially across filers and law firms. Systems that attempted to identify 10b5-1 plan transactions through footnote text matching had substantial false-positive and false-negative rates. The 2023 amendments added a checkbox to the updated Form 4 for plan transactions, which improves structured identification, but the pre-2023 historical corpus remains opaque on this dimension.
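The brittleness of footnote matching is easy to see in a small sketch. The pattern below is a hypothetical heuristic written for illustration, not one taken from any production system; it assumes the footnote texts have already been extracted from the filing as plain strings.

```python
import re

# Hypothetical pattern: tolerates spacing and hyphen variants such as
# "10b5-1", "10b-5-1", and "Rule 10b5-1(c)".
PLAN_RE = re.compile(r"10b\s*-?\s*5\s*-?\s*1", re.IGNORECASE)


def mentions_10b5_1(footnotes):
    """Return True if any footnote text mentions Rule 10b5-1."""
    return any(PLAN_RE.search(text) for text in footnotes)


# A clear hit.
hit = mentions_10b5_1(["Sales were effected pursuant to a Rule 10b5-1 trading plan."])

# False negative: the plan is described without citing the rule.
missed = mentions_10b5_1(["Shares were sold under a pre-arranged trading plan."])

# False positive: the footnote explicitly denies a plan.
wrong = mentions_10b5_1(["This sale was not made pursuant to a Rule 10b5-1 plan."])
```

Handling negation and paraphrase pushes the heuristic toward ever longer lists of special cases, which is why the structured checkbox matters for filings made under the amended form.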
Ownership vehicle complexity. Indirect holdings reported with an “I” in the direct/indirect ownership field may represent anything from a revocable living trust that the insider controls completely to a discretionary family partnership managed by a third party. The footnote explanations are required but are often minimal. Treating all indirect holdings as equivalent in a signal model overstates the insider's economic commitment in some cases and understates it in others. The safest analytical posture for discretionary signal purposes is to restrict the universe to direct-ownership P-coded transactions.
For federal court records covering insider trading prosecutions — criminal dockets, plea agreements, and sentencing documents in SEC-referred cases — see PACER Federal Courts: The Public Access System Behind Every Federal Case.
FEC enforcement matters involving political finance disclosure violations share structural parallels with Section 16 insider reporting obligations — both are mandatory disclosure regimes with civil enforcement for late or incomplete filings. See FEC Enforcement: Matter Under Review (MUR) Database.
IRS Criminal Investigation pursues financial crime cases that frequently begin with Form 4 and 13D disclosure anomalies that reveal unreported income or fraudulent schemes. See IRS Criminal Investigation: The Federal Database Behind Tax Fraud and Financial Crime Prosecutions.