How to Verify NPI Numbers in Bulk

Stale NPIs wreck your outreach and your credibility. Here is how to validate thousands of NPI numbers against CMS data without doing it one at a time.

2026-03-29

NPI verification, data quality, CMS API, bulk validation

Why Bulk NPI Verification Matters

Every healthcare provider contact list is built on NPI numbers. When those NPIs are wrong, deactivated, or matched to the wrong provider, everything downstream breaks. Your emails go to the wrong person. Your segmentation is off. Your compliance records are inaccurate. And your sales team wastes time chasing contacts who are not at the practice your data says they are.

The CMS NPI Registry contains over 7 million NPI records. Roughly 300,000-400,000 records change in some way every year: address updates, taxonomy changes, deactivations, and reactivations. If you are working with a provider list that has not been validated against current CMS data, you are guaranteed to have errors.

Manual verification (typing NPIs into the CMS lookup tool one at a time) works for 10 records. It does not work for 10,000. This guide covers the methods, tools, and common pitfalls for verifying NPI numbers at scale.

Method 1: The NPPES API

CMS provides a free API for querying the NPI registry. This is the most reliable method for bulk verification because you are checking directly against the authoritative source.

How the API Works

The NPPES API endpoint is https://npiregistry.cms.hhs.gov/api/. You can query by NPI number, provider name, taxonomy code, state, or a combination of fields. The API returns JSON with the full provider record including name, address, taxonomy, enumeration date, and deactivation status.

Key parameters for bulk verification:

  • number: The 10-digit NPI number you want to verify
  • version: Set to "2.1" for the current API version
  • limit: Number of results per query (max 200)

A single query looks like: GET /api/?number=1234567890&version=2.1

The API returns a result_count field. If it is 0, the NPI does not exist in the registry. If it returns a record, check the deactivation fields to confirm the NPI is still active.
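As a rough illustration of the single lookup described above, here is a minimal Python sketch using only the standard library. The helper names (build_query_url, npi_exists, lookup) are made up for this example, and it relies only on the result_count field mentioned above.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API_URL = "https://npiregistry.cms.hhs.gov/api/"


def build_query_url(npi: str) -> str:
    """Build the lookup URL for one NPI using the parameters listed above."""
    return API_URL + "?" + urlencode({"number": npi, "version": "2.1"})


def npi_exists(payload: dict) -> bool:
    """True when the decoded response reports at least one matching record."""
    return payload.get("result_count", 0) > 0


def lookup(npi: str, timeout: float = 10.0) -> dict:
    """Fetch and decode the JSON response for one NPI (makes a network call)."""
    with urlopen(build_query_url(npi), timeout=timeout) as resp:
        return json.load(resp)
```

Keeping URL building and response checking separate from the network call makes the logic easy to test without hitting the live API.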

Rate Limits and Practical Throughput

CMS does not publish official rate limits for the NPPES API, but the practical limit is roughly 2-3 requests per second before you start getting throttled or receiving timeout errors. At that rate, verifying 10,000 NPIs takes approximately 60-90 minutes.

Tips for maximizing throughput:

  • Space your requests with a consistent delay (400-500ms between calls)
  • Implement retry logic with exponential backoff for failed requests
  • Run verification during off-peak hours (evenings and weekends) when the API is less loaded
  • Cache results locally so you do not re-verify NPIs you already checked recently
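The tips above can be combined into one loop. This is a sketch under assumptions, not a tuned client: verify_all is a hypothetical helper, fetch is any function you supply that looks up one NPI and raises OSError on a network failure, and the 0.45 second delay is simply the midpoint of the range suggested above.

```python
import time
from typing import Callable, Dict, Iterable


def verify_all(
    npis: Iterable[str],
    fetch: Callable[[str], dict],
    delay: float = 0.45,
    max_retries: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, dict]:
    """Look up each NPI once, pausing between calls and retrying failures."""
    cache: Dict[str, dict] = {}
    for npi in npis:
        if npi in cache:  # already verified in this run, skip the call
            continue
        for attempt in range(max_retries + 1):
            try:
                cache[npi] = fetch(npi)
                break
            except OSError:  # urllib's URLError and socket timeouts are OSError subclasses
                if attempt == max_retries:
                    cache[npi] = {"error": "lookup failed"}
                else:
                    sleep(delay * 2 ** (attempt + 1))  # exponential backoff
        sleep(delay)  # fixed pause before the next NPI
    return cache
```

The in-memory cache only covers a single run. To avoid re-verifying NPIs across runs, you would persist the results (for example in a local database) and check them before calling fetch.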

Method 2: NPPES Bulk Data Download

For large-scale verification (50,000+ records), the API approach is too slow. Instead, download the full NPPES data dissemination file and run your verification locally.

What You Get

CMS publishes two files: a full replacement file (updated monthly, roughly 8-10GB once unzipped) and weekly incremental update files. The full file contains every NPI record ever created, including deactivated ones. The weekly files contain only records that changed since the last update.

For initial verification, download the full file. For ongoing maintenance, process the weekly updates.

Setting Up Local Verification

The NPPES file is a comma-delimited CSV with over 300 columns. For verification purposes, you only need a subset:

  • NPI: The 10-digit identifier
  • Entity Type Code: 1 (individual) or 2 (organization)
  • Provider First Name / Last Name: For name matching
  • Provider Business Practice Location Address: Current practice address
  • Healthcare Provider Taxonomy Code: Specialty classification
  • NPI Deactivation Date: If populated, the NPI is deactivated
  • NPI Reactivation Date: If populated after a deactivation, the NPI was reactivated

Load the relevant columns into a database (PostgreSQL, SQLite, or even a pandas DataFrame for smaller datasets). Then join your provider list against the NPPES data on NPI number. Any NPI in your list that does not match, or matches a deactivated record, gets flagged.
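Here is one way the load-and-join step might look with SQLite from Python's standard library. Treat it as a sketch: it assumes a comma-delimited file with a header row, the column headers in COLS are assumed to match the file exactly, the function names are hypothetical, and the sample NPIs are made up. It uses the simple rule from the column list above (deactivation date populated, reactivation date empty) to mark a record as deactivated.

```python
import csv
import io
import sqlite3

# Column headers as they are assumed to appear in the NPPES file.
COLS = ["NPI", "Entity Type Code", "NPI Deactivation Date", "NPI Reactivation Date"]


def load_reference(conn, csv_file):
    """Load the needed NPPES columns from an open CSV file into SQLite."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS nppes "
        "(npi TEXT PRIMARY KEY, entity_type TEXT, deactivated TEXT, reactivated TEXT)"
    )
    rows = ([row[c] for c in COLS] for row in csv.DictReader(csv_file))
    conn.executemany("INSERT OR REPLACE INTO nppes VALUES (?, ?, ?, ?)", rows)


def flag_list(conn, my_npis):
    """Return {npi: 'ok' | 'missing' | 'deactivated'} for each NPI in the list."""
    conn.execute("CREATE TEMP TABLE IF NOT EXISTS mine (npi TEXT)")
    conn.execute("DELETE FROM mine")
    conn.executemany("INSERT INTO mine VALUES (?)", [(n,) for n in my_npis])
    query = """
        SELECT m.npi,
               CASE WHEN r.npi IS NULL THEN 'missing'
                    WHEN r.deactivated <> '' AND r.reactivated = '' THEN 'deactivated'
                    ELSE 'ok' END
        FROM mine m LEFT JOIN nppes r ON r.npi = m.npi
    """
    return dict(conn.execute(query).fetchall())


# Usage with a tiny in-memory sample instead of the real file.
sample = io.StringIO(
    "NPI,Entity Type Code,NPI Deactivation Date,NPI Reactivation Date\n"
    "1234567893,1,,\n"
    "1111111112,1,05/01/2020,\n"
)
conn = sqlite3.connect(":memory:")
load_reference(conn, sample)
flags = flag_list(conn, ["1234567893", "1111111112", "9999999999"])
```

A LEFT JOIN keeps every NPI from your list in the output, so records missing from the reference data show up as flags instead of silently dropping out.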

Method 3: NPI Checksum Validation

Before you even hit the CMS API or database, you can catch obviously invalid NPIs using the Luhn algorithm. NPI numbers use a modified version of the Luhn check digit formula (the same algorithm used to validate credit card numbers).

The process:

  1. Prefix the 10-digit NPI with "80840" (the CMS prefix)
  2. Apply the Luhn algorithm to the resulting 15-digit number
  3. If the check digit is valid, the NPI is structurally correct
  4. If it fails, the NPI is definitely invalid (no need to look it up)
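The steps above can be sketched in a few lines of Python. The function name is_valid_npi is just a label for this example; the logic is the standard Luhn check applied to the prefixed 15-digit string.

```python
import re


def is_valid_npi(npi: str) -> bool:
    """Check the NPI check digit by running Luhn over '80840' + NPI."""
    if not re.fullmatch(r"[0-9]{10}", npi):
        return False
    total = 0
    # Walk right to left, doubling every second digit.
    for i, ch in enumerate(reversed("80840" + npi)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9  # same as adding the two digits of the doubled value
        total += d
    return total % 10 == 0


is_valid_npi("1234567893")  # True: structurally valid
is_valid_npi("1234567983")  # False: the 8 and 9 are transposed
```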

Checksum validation catches typos, transposed digits, and completely fabricated NPIs. It does not confirm that the NPI is active or that it belongs to the provider you think it does. Use it as a fast pre-filter before running API or database verification.

In practice, running Luhn validation first catches 2-5% of NPIs in a typical vendor-sourced list. Those are records that should never have been in the data in the first place.

What to Check Beyond "Does This NPI Exist?"

Existence is the minimum bar. A thorough NPI verification checks several additional fields:

Deactivation Status

An NPI can be deactivated for several reasons: the provider retired, lost their license, died, or the NPI was issued in error. The NPPES record includes both deactivation and reactivation dates. Check that there is no deactivation date, or that any deactivation date is followed by a later reactivation date.
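A small sketch of that date comparison, assuming the two dates arrive as strings in MM/DD/YYYY format (an assumption about your source data) and that an empty string means the field is not populated. The function name is_active is hypothetical.

```python
from datetime import datetime
from typing import Optional

DATE_FMT = "%m/%d/%Y"  # assumed date format, e.g. 05/01/2020


def _parse(value: Optional[str]):
    """Turn a date string into a date, or None when the field is empty."""
    return datetime.strptime(value, DATE_FMT).date() if value else None


def is_active(deactivation_date: Optional[str], reactivation_date: Optional[str]) -> bool:
    """Active if never deactivated, or reactivated on or after the deactivation."""
    deact = _parse(deactivation_date)
    react = _parse(reactivation_date)
    if deact is None:
        return True
    return react is not None and react >= deact
```

Comparing the two dates, instead of only checking whether the deactivation field is empty, avoids flagging providers who were deactivated and later reactivated.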

Name Match

Verify that the name in your data matches the name on the NPI record. Mismatches happen when data vendors assign the wrong NPI to a provider (especially likely with common names like "John Smith" or "David Lee"). Use fuzzy matching to account for middle initials, suffixes, and name variations, but flag clear mismatches for manual review.
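One possible way to do the fuzzy comparison with Python's standard library is shown below. The suffix list, the 0.85 threshold, and the function names are all assumptions chosen for illustration; tune them against your own data before trusting the results.

```python
import re
from difflib import SequenceMatcher

SUFFIXES = {"jr", "sr", "ii", "iii", "iv"}  # illustrative, not exhaustive


def normalize_name(name: str) -> str:
    """Lowercase, strip punctuation, drop initials and suffixes, sort the tokens."""
    tokens = re.sub(r"[^a-z\s]", " ", name.lower()).split()
    kept = [t for t in tokens if len(t) > 1 and t not in SUFFIXES]
    return " ".join(sorted(kept))  # sorting makes "Smith, John" equal "John Smith"


def name_similarity(a: str, b: str) -> float:
    """Similarity between 0.0 and 1.0 on the normalized names."""
    return SequenceMatcher(None, normalize_name(a), normalize_name(b)).ratio()


def names_match(a: str, b: str, threshold: float = 0.85) -> bool:
    """True when the names are close enough to treat as the same provider."""
    return name_similarity(a, b) >= threshold
```

Records that fall below the threshold are the ones to route to manual review.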

Taxonomy Match

Confirm that the taxonomy code on the NPI record aligns with the specialty in your data. If your list says a provider is a dermatologist but their NPI taxonomy is internal medicine, something is wrong. Providers can have multiple taxonomy codes, so check all of them, but flag records where none of the taxonomy codes match your expected specialty.
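The "check all of them, flag if none match" rule comes down to a set intersection. In this sketch, EXPECTED_CODES is a hypothetical mapping you would maintain yourself from your list's specialty labels to acceptable taxonomy codes; the two entries are examples only, and a real mapping would likely include subspecialty codes too.

```python
from typing import Iterable

# Illustrative mapping from a list's specialty label to acceptable taxonomy codes.
EXPECTED_CODES = {
    "dermatology": {"207N00000X"},
    "internal medicine": {"207R00000X"},
}


def taxonomy_matches(specialty: str, npi_codes: Iterable[str]) -> bool:
    """True when any taxonomy code on the NPI record is expected for the specialty."""
    expected = EXPECTED_CODES.get(specialty.strip().lower(), set())
    return bool(expected & set(npi_codes))
```

A specialty label that is missing from the mapping returns False, which sends the record to review instead of letting it pass unchecked.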

Address Currency

Compare the practice address in your data to the practice address on the NPI record. Address mismatches often indicate that the provider has moved to a new practice but your data has not been updated. Note that NPI addresses are self-reported and may lag actual moves by months or years, so an address mismatch does not automatically mean your data is wrong. But it does mean you should flag the record for additional verification.
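Raw string comparison produces a lot of false mismatches ("Street" vs "ST", ZIP+4 vs five-digit ZIP), so it helps to normalize both sides first. The sketch below is deliberately simple: the abbreviation table is a small illustrative subset, the function names are hypothetical, and a production pipeline would likely use a proper address standardization service.

```python
import re

# Illustrative subset of common street-type abbreviations.
ABBREVIATIONS = {
    "street": "st", "avenue": "ave", "suite": "ste",
    "road": "rd", "boulevard": "blvd", "drive": "dr",
}


def normalize_address(line: str, postal_code: str) -> tuple:
    """Lowercase, strip punctuation, abbreviate street words, keep the 5-digit ZIP."""
    words = re.sub(r"[^a-z0-9\s]", " ", line.lower()).split()
    words = [ABBREVIATIONS.get(w, w) for w in words]
    return (" ".join(words), postal_code.strip()[:5])


def address_differs(mine: tuple, theirs: tuple) -> bool:
    """Compare two (address line, postal code) pairs after normalization."""
    return normalize_address(*mine) != normalize_address(*theirs)
```

A True result here is a flag for follow-up, not proof that your record is wrong.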

Common NPI Data Errors and How to Handle Them

After verifying millions of NPI records, we see these errors most often:

  • Transposed digits: Two adjacent digits are swapped. Luhn validation catches nearly all of these (the one exception is a swapped 0 and 9). When the checksum passes but the NPI returns the wrong provider, look for a 09/90 swap or for two digits swapped across non-adjacent positions.
  • Deactivated NPIs still in use: Vendors often do not remove deactivated NPIs from their databases. Roughly 1-2% of records in a typical commercial provider list have deactivated NPIs.
  • Type 1 / Type 2 confusion: An organizational NPI (Type 2) is listed where an individual NPI (Type 1) should be, or vice versa. This happens when the practice NPI is used instead of the physician's individual NPI.
  • Duplicate NPIs for the same provider: Some providers have multiple active NPIs, usually because they registered a new one when changing practices instead of updating the existing one. CMS tries to catch these, but some slip through.
  • Stale taxonomy codes: A provider changed their practice focus but never updated their taxonomy code with CMS. The NPI is valid, but the specialty data is outdated.

Building an Automated Verification Pipeline

For teams that maintain ongoing provider lists, manual verification runs do not scale. Here is a practical pipeline architecture:

  1. Weekly NPPES delta download: Automatically download the weekly NPPES update file every Monday. Parse the changes and update your local reference database.
  2. Nightly Luhn pre-check: Run checksum validation on any new records added to your system during the day. Flag failures immediately.
  3. Rolling API verification: Verify a batch of records against the NPPES API daily, cycling through your full database over 30 days. This catches deactivations and changes that the weekly file might miss due to timing.
  4. Quarterly full refresh: Download the complete NPPES file quarterly and run a full comparison against your database. This catches any records that slipped through the incremental processes.
  5. Change alerts: When a verified NPI record changes (address, taxonomy, deactivation), trigger an alert to the account owner or data steward so they can update downstream systems.
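For the rolling API verification step, one simple way to spread a database across a 30-day cycle is to bucket NPIs by their numeric value. This is only a sketch of the idea: todays_batch is a hypothetical helper, and the buckets are roughly, not exactly, even in size.

```python
from datetime import date
from typing import Iterable, List


def todays_batch(npis: Iterable[str], day_number: int, cycle_days: int = 30) -> List[str]:
    """Return the NPIs whose bucket matches today's slot in the cycle."""
    slot = day_number % cycle_days
    return [n for n in npis if int(n) % cycle_days == slot]


# Usage: date.toordinal() gives a day counter that advances by one each day.
all_npis = ["1234567893", "1234567894", "1234567895"]
batch = todays_batch(all_npis, date.today().toordinal())
```

Because the bucket depends only on the NPI itself, every record lands in exactly one slot and gets rechecked once per cycle without any extra bookkeeping.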

This pipeline catches 95%+ of NPI data issues within days of the change occurring. The remaining edge cases (providers who never update their NPI records) require additional verification through state licensing boards and practice website checks.

When to Skip DIY and Use a Verified Data Source

Building and maintaining an NPI verification pipeline is worthwhile if your organization manages tens of thousands of provider records and has engineering resources to maintain the infrastructure. If you are a sales team that needs a verified provider list for a campaign, the time and effort of building this pipeline from scratch does not make sense.

Provyx runs NPI verification as part of every data delivery. Every record is checked against current CMS data, matched on name and taxonomy, and flagged if anything does not align. You get clean data without building the verification infrastructure yourself.

For teams that want to verify their existing data, send us a sample and we will run it through our verification pipeline and show you what your current error rate looks like. Most teams are surprised by the results.

About the Author

Rome

Former Datajoy (acquired by Databricks), Microsoft, Salesforce. UC Berkeley Haas MBA.

Frequently Asked Questions

How long does it take to verify 10,000 NPI numbers?

Using the NPPES API at a practical rate of 2-3 requests per second, verifying 10,000 NPIs takes 60-90 minutes. Using the bulk NPPES download file and local database matching, the same volume can be processed in under 5 minutes once the data is loaded. For very large datasets (100,000+), the bulk download method is strongly recommended.

What percentage of NPI records in a typical list are invalid?

In vendor-sourced provider lists, we typically find 2-5% of NPIs fail Luhn checksum validation (structurally invalid), 1-2% are deactivated, and another 3-5% have name or taxonomy mismatches. Combined, 6-12% of records in a typical list have some form of NPI data issue.

Does the NPI Registry API have rate limits?

CMS does not publish official rate limits, but the practical ceiling is 2-3 requests per second before experiencing throttling or timeouts. For bulk verification, downloading the full NPPES data file and running checks locally is faster and more reliable than the API.

Can an NPI number be reactivated after deactivation?

Yes. Deactivated NPIs can be reactivated by the provider through CMS. The NPPES record will show both a deactivation date and a reactivation date. When verifying NPIs, check that the most recent status-change date indicates an active status, not just the presence or absence of a deactivation date.
