Google Search Console API with Python: A Hands-On Tutorial

Google Search Console (GSC) shows you the queries, pages, countries, and devices that drive impressions and clicks to your site. The web UI is useful for spot checks, but it caps tables, throttles exports, and makes it painful to combine data across date ranges. The moment you want repeatable reporting, large keyword pulls, or automated content briefs, you need the API.

This tutorial walks through using the Google Search Console API with Python end to end. You'll enable the API in Google Cloud, choose between OAuth and a service account, install the official client library, build a service object, construct a Search Analytics query, paginate past the per-request row cap, and write the results to a CSV file and a pandas DataFrame. Every code block is copy-paste runnable once you've filled in your credentials and site URL.

We'll stick to the official google-api-python-client library and the searchconsole API at version v1, which exposes the Search Analytics endpoint. If you'd rather understand the API conceptually before writing code, the companion Google Search Console API guide covers the data model and quotas in more depth.

Step 1: Enable the Search Console API in Google Cloud

Every Google API call is tied to a Google Cloud project, even when the data itself lives in Search Console. Here's the one-time setup.

Go to the Google Cloud Console and create a new project (or pick an existing one).
Open APIs & Services -> Library, search for "Google Search Console API", and click Enable.
Open APIs & Services -> Credentials to create the credential you'll use for authentication (covered next).

The API is free and has generous quotas for typical reporting. You're requesting data for properties you already own or have been granted access to in Search Console, so there is no separate data subscription.

Step 2: Choose your authentication method

The Search Console API supports two practical auth flows. Picking the right one upfront saves a lot of friction.

OAuth desktop flow (interactive, for your own properties)

Use OAuth when you personally have access to the property in Search Console and you're running the script on a machine where you can open a browser once to consent. This is the best choice for ad-hoc analysis, local scripts, and notebooks.

In the Cloud Console, under Credentials -> Create Credentials -> OAuth client ID, choose application type Desktop app. Download the JSON file (commonly named client_secret.json). You'll also need to configure the OAuth consent screen; while your app is in "Testing" status, add your own Google account as a test user so consent succeeds.

Service account (non-interactive, for automation)

Use a service account when you want unattended automation: a cron job, a server, a CI pipeline, or a multi-site tool. A service account authenticates with a key file and never needs a browser.

The critical, easy-to-miss step: a service account is its own identity with its own email address (something like gsc-reader@your-project.iam.gserviceaccount.com). It has no access to your Search Console properties by default. You must add that service account email as a user on the property in Search Console: open the property, go to Settings -> Users and permissions -> Add user, paste the service account email, and grant at least Restricted access (Full also works). Without this step, every API call for that property returns a 403 even though the credentials are valid.

Quick rule of thumb: reach for OAuth for interactive, personal use; reach for a service account for scheduled or shared automation. If juggling Cloud projects and key files sounds like more than you want to take on, the free Search Console Tools app lets you sign in with Google OAuth and pull this same data with no code or setup. More on that at the end.

Step 3: Install the client library

Create a virtual environment and install the official Google API client plus the auth helper libraries. We'll add pandas for the export step.

python -m venv venv
source venv/bin/activate          # Windows: venv\Scripts\activate
pip install google-api-python-client google-auth google-auth-oauthlib pandas

google-api-python-client provides googleapiclient.discovery.build, which constructs the service object.
google-auth-oauthlib handles the interactive OAuth desktop flow.
google-auth handles service-account credentials.

Step 4: Build the service object

The service object is your handle to the API. You build it once and reuse it for every query. The pattern is identical regardless of auth method; only the credentials differ.

Option A: Build with OAuth credentials

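This snippet runs the desktop flow the first time (opening a browser), then caches the token in token.json. On later runs it loads the cached token and, if the access token has expired, refreshes it silently with the stored refresh token, so the browser only reopens when no usable token is left.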

import os
from google.auth.transport.requests import Request
from google.oauth2.credentials import Credentials
from google_auth_oauthlib.flow import InstalledAppFlow
from googleapiclient.discovery import build

# Read-only scope is enough for Search Analytics reporting.
SCOPES = ["https://www.googleapis.com/auth/webmasters.readonly"]
CLIENT_SECRET_FILE = "client_secret.json"
TOKEN_FILE = "token.json"


def get_service_oauth():
    creds = None
    if os.path.exists(TOKEN_FILE):
        creds = Credentials.from_authorized_user_file(TOKEN_FILE, SCOPES)

    if not creds or not creds.valid:
        if creds and creds.expired and creds.refresh_token:
            creds.refresh(Request())
        else:
            flow = InstalledAppFlow.from_client_secrets_file(
                CLIENT_SECRET_FILE, SCOPES
            )
            creds = flow.run_local_server(port=0)
        with open(TOKEN_FILE, "w") as token:
            token.write(creds.to_json())

    return build("searchconsole", "v1", credentials=creds)

Option B: Build with a service account

If you went the service-account route, point the helper at your downloaded key file. Remember that the service account email must already be added as a user in Search Console (Step 2).

from google.oauth2 import service_account
from googleapiclient.discovery import build

SCOPES = ["https://www.googleapis.com/auth/webmasters.readonly"]
KEY_FILE = "service_account.json"


def get_service_service_account():
    creds = service_account.Credentials.from_service_account_file(
        KEY_FILE, scopes=SCOPES
    )
    return build("searchconsole", "v1", credentials=creds)

Either function returns the same kind of service object, so the rest of this tutorial works unchanged with whichever you pick.

Step 5: Construct a Search Analytics query

Search Analytics data is retrieved with service.searchanalytics().query(). You pass two things: the siteUrl of the property and a body dictionary describing what you want.

A few notes on siteUrl. For a URL-prefix property, it looks like https://example.com/. For a Domain property, it uses the format sc-domain:example.com. Use exactly the string Search Console shows for your property.
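
If you're unsure which string to use, one way to check is to ask the API which properties the authenticated identity can see. The sketch below assumes a service object built in Step 4; list_properties and print_properties are illustrative helper names, not part of the client library. It relies on the sites().list() method, whose response carries a siteEntry list with siteUrl and permissionLevel fields.

```python
def list_properties(service):
    """Return (siteUrl, permissionLevel) pairs for every visible property."""
    response = service.sites().list().execute()
    # "siteEntry" is absent when the identity has no properties at all.
    return [
        (entry["siteUrl"], entry.get("permissionLevel", "unknown"))
        for entry in response.get("siteEntry", [])
    ]


def print_properties(service):
    """Print one property per line so the exact siteUrl can be copied."""
    for site_url, permission in list_properties(service):
        print(f"{permission:25} {site_url}")


# Usage, assuming one of the helpers from Step 4:
#   service = get_service_oauth()
#   print_properties(service)
```

An empty result for a service account is a strong hint that its email has not been added as a user on the property yet.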

Here are the fields you'll use most in the request body:

startDate / endDate: ISO YYYY-MM-DD strings. GSC data lags by a couple of days, so pick an end date a few days in the past.
dimensions: a list such as ["query"], ["page"], or ["query", "page"]. Each row returned is one unique combination of the requested dimensions.
dimensionFilterGroups: optional filtering (see below).
rowLimit: rows per request. The maximum accepted value is 25000.
startRow: the zero-based offset for pagination. To get rows 25000-49999, set startRow to 25000.
type: the search type, such as "web", "image", "video", "news", or "discover". Defaults to web search.
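
Because of the reporting lag, hard-coded dates go stale quickly. A minimal sketch for computing a rolling window instead; recent_date_range is a hypothetical helper, and the three-day lag and 28-day window are assumptions you should adjust to what your property actually shows.

```python
from datetime import date, timedelta


def recent_date_range(days=28, lag_days=3, today=None):
    """Return (startDate, endDate) as ISO strings for a rolling window.

    The window ends lag_days before today and spans `days` days inclusive.
    `today` can be injected to make the result reproducible.
    """
    today = today or date.today()
    end = today - timedelta(days=lag_days)
    start = end - timedelta(days=days - 1)
    return start.isoformat(), end.isoformat()


# Usage: start_date, end_date = recent_date_range()
```

Both strings can be passed straight into the startDate and endDate fields of the request body.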

Filtering with dimensionFilterGroups

dimensionFilterGroups is a list of groups; within each group you list filters, each with a dimension, an operator, and an expression. Supported operators include equals, notEquals, contains, notContains, and includingRegex / excludingRegex. The example below keeps only queries that contain "python".

def build_request_body(start_date, end_date, start_row=0):
    return {
        "startDate": start_date,
        "endDate": end_date,
        "dimensions": ["query", "page"],
        "dimensionFilterGroups": [
            {
                "filters": [
                    {
                        "dimension": "query",
                        "operator": "contains",
                        "expression": "python",
                    }
                ]
            }
        ],
        "type": "web",
        "rowLimit": 25000,
        "startRow": start_row,
    }

Regex filters are powerful for grouping intent or excluding branded terms. If you want to go deeper on patterns, see our guide to Search Console regex filters.
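
As a rough illustration, the sketch below builds a body that keeps question-style queries and drops branded ones. build_regex_filter_body is a hypothetical helper, and both patterns (including the brand name "acme") are made up for the example. Search Console regex filters use RE2 syntax, and filters inside one group are combined with AND.

```python
def build_regex_filter_body(start_date, end_date, include_pattern,
                            exclude_pattern=None):
    """Build a query-only request body filtered by RE2 regular expressions."""
    filters = [
        {
            "dimension": "query",
            "operator": "includingRegex",
            "expression": include_pattern,
        }
    ]
    if exclude_pattern:
        # Every filter in the same group must match, so this narrows the set.
        filters.append(
            {
                "dimension": "query",
                "operator": "excludingRegex",
                "expression": exclude_pattern,
            }
        )
    return {
        "startDate": start_date,
        "endDate": end_date,
        "dimensions": ["query"],
        "dimensionFilterGroups": [{"groupType": "and", "filters": filters}],
        "type": "web",
        "rowLimit": 25000,
        "startRow": 0,
    }


# Example patterns (assumptions): question intent, minus a fictional brand.
body = build_regex_filter_body(
    "2026-04-01", "2026-04-30",
    include_pattern="^(how|what|why) ",
    exclude_pattern="acme",
)
```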

A single query call

Before paginating, it's worth seeing one raw call so you understand the response shape.

service = get_service_oauth()   # or get_service_service_account()
SITE_URL = "https://example.com/"

body = build_request_body("2026-04-01", "2026-04-30")
response = service.searchanalytics().query(siteUrl=SITE_URL, body=body).execute()

for row in response.get("rows", [])[:5]:
    query, page = row["keys"]
    print(query, page, row["clicks"], row["impressions"], row["ctr"], row["position"])

Each row has a keys list (in the same order as your dimensions) plus clicks, impressions, ctr, and position metrics. If rows is missing from the response, there was simply no data for your filters and date range.

Step 6: Paginate past the 25,000-row limit

A single request returns at most 25,000 rows. Large properties have far more query/page combinations than that, so you must paginate by incrementing startRow until the API returns a short (or empty) page. The loop below keeps requesting until a page comes back with fewer rows than rowLimit, which signals you've reached the end.

This is the heart of any serious GSC pull. Pagination does not make the data unlimited, though: Search Console exposes at most roughly 50,000 rows per day per search type for a property, ranked by clicks, so totals vary by property and the loop simply collects whatever the API is willing to return. (If you keep bumping into ceilings, the 1,000-row limit explainer clears up which caps apply where.)

import time
from googleapiclient.errors import HttpError

ROW_LIMIT = 25000


def fetch_all_rows(service, site_url, start_date, end_date, dimensions=None):
    """Page through Search Analytics results until the API runs out of rows."""
    dimensions = dimensions or ["query", "page"]
    all_rows = []
    start_row = 0

    while True:
        body = {
            "startDate": start_date,
            "endDate": end_date,
            "dimensions": dimensions,
            "type": "web",
            "rowLimit": ROW_LIMIT,
            "startRow": start_row,
        }

        try:
            response = service.searchanalytics().query(
                siteUrl=site_url, body=body
            ).execute()
        except HttpError as err:
            # 429/500-class errors: back off briefly and retry the same page.
            if err.resp.status in (429, 500, 503):
                time.sleep(5)
                continue
            raise

        rows = response.get("rows", [])
        if not rows:
            break

        all_rows.extend(rows)
        print(f"Fetched {len(rows)} rows (offset {start_row}), "
              f"running total {len(all_rows)}")

        # A short page means there is no more data after this one.
        if len(rows) < ROW_LIMIT:
            break

        start_row += ROW_LIMIT

    return all_rows

The retry block matters in practice. The API occasionally returns transient 429 (rate limit) or 5xx errors on big pulls; a short sleep and a retry of the same offset is usually enough. Be aware that the simple version above retries indefinitely if the error never clears. That is fine for most supervised reporting runs, but for very large or recurring jobs you will want exponential backoff and a hard retry cap.
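
One possible shape for a capped retry with exponential backoff is sketched below. execute_with_backoff is a hypothetical helper, and the retry count, base delay, and status list are assumptions rather than recommended values. It reads the status from err.resp.status, which is where googleapiclient's HttpError exposes it, and re-raises anything else untouched.

```python
import random
import time


def execute_with_backoff(make_request, max_retries=5, base_delay=1.0,
                         retry_statuses=(429, 500, 502, 503, 504),
                         sleep=time.sleep):
    """Run make_request().execute(), retrying transient HTTP errors.

    make_request is a zero-argument callable that builds a fresh request.
    The wait doubles on each attempt and includes up to 1s of random jitter.
    """
    for attempt in range(max_retries + 1):
        try:
            return make_request().execute()
        except Exception as err:
            # Errors without a resp.status (or with a non-transient status)
            # are not retried.
            status = getattr(getattr(err, "resp", None), "status", None)
            if status not in retry_statuses or attempt == max_retries:
                raise
            sleep(base_delay * (2 ** attempt) + random.uniform(0, 1))


# Usage inside the pagination loop (replacing the try/except block):
#   response = execute_with_backoff(
#       lambda: service.searchanalytics().query(siteUrl=site_url, body=body)
#   )
```

The client library's execute() method also accepts a num_retries argument that applies its own randomized backoff, which may be all you need if you don't want a custom helper.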

Step 7: Export to a pandas DataFrame and CSV

Raw API rows aren't analysis-ready: the dimension values are nested in a keys list. The function below flattens each row into a flat dict, with one column per dimension plus the four metrics, then loads everything into a DataFrame.

import pandas as pd


def rows_to_dataframe(rows, dimensions):
    records = []
    for row in rows:
        record = dict(zip(dimensions, row["keys"]))
        record["clicks"] = row.get("clicks", 0)
        record["impressions"] = row.get("impressions", 0)
        record["ctr"] = row.get("ctr", 0.0)
        record["position"] = row.get("position", 0.0)
        records.append(record)

    df = pd.DataFrame(records)
    if not df.empty:
        # CTR comes back as a 0-1 fraction; show it as a percentage.
        df["ctr"] = (df["ctr"] * 100).round(2)
        df["position"] = df["position"].round(1)
        df = df.sort_values("clicks", ascending=False).reset_index(drop=True)
    return df

Now tie it together into a runnable script.

if __name__ == "__main__":
    service = get_service_oauth()          # or get_service_service_account()
    SITE_URL = "https://example.com/"      # or "sc-domain:example.com"
    DIMENSIONS = ["query", "page"]

    rows = fetch_all_rows(
        service,
        SITE_URL,
        start_date="2026-04-01",
        end_date="2026-04-30",
        dimensions=DIMENSIONS,
    )

    df = rows_to_dataframe(rows, DIMENSIONS)
    print(df.head(20))

    df.to_csv("gsc_search_analytics.csv", index=False)
    print(f"Wrote {len(df)} rows to gsc_search_analytics.csv")

Save the snippets from Steps 4 through 7 in a single file, for example gsc_pull.py, and run it with python gsc_pull.py. On the first OAuth run a browser window opens for consent; after that the cached token keeps it non-interactive. You'll get a gsc_search_analytics.csv with columns for query, page, clicks, impressions, ctr, and position.

If you need the data in a different shape for downstream tooling, exporting to JSON is a one-liner (df.to_json(...)); we cover the nuances in exporting Search Console data to JSON.

Step 8: Turn the data into content briefs

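Once the data is in a DataFrame, content-brief logic is just filtering. A classic move is finding "striking distance" queries: terms ranking on page two (average position roughly 11-20) with meaningful impressions but low CTR. Those are pages that could win clicks with targeted on-page improvements. The filter below applies the position and impression thresholds only; if you also want to screen on click-through rate, add a condition on the ctr column, remembering that it now holds a percentage rather than a 0-1 fraction.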

striking_distance = df[
    (df["position"] >= 11)
    & (df["position"] <= 20)
    & (df["impressions"] >= 100)
].sort_values("impressions", ascending=False)

striking_distance.to_csv("content_opportunities.csv", index=False)
print(striking_distance.head(20))

Group by page to see which URLs attract many near-miss queries, and you have a ranked backlog of pages to optimize, each with the exact terms to target. That is the core idea behind a data-driven content brief.
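
The grouping step could look something like the sketch below. rank_pages_by_opportunity is a hypothetical helper, the output column names are made up for the example, and it assumes a striking_distance DataFrame with the query, page, clicks, and impressions columns produced earlier.

```python
import pandas as pd


def rank_pages_by_opportunity(striking_distance: pd.DataFrame) -> pd.DataFrame:
    """Summarize near-miss queries per page, busiest pages first."""
    return (
        striking_distance.groupby("page")
        .agg(
            near_miss_queries=("query", "nunique"),
            total_impressions=("impressions", "sum"),
            total_clicks=("clicks", "sum"),
            # Keep up to five queries per page, in the input's row order.
            top_queries=("query", lambda q: ", ".join(q.head(5))),
        )
        .sort_values(["near_miss_queries", "total_impressions"], ascending=False)
        .reset_index()
    )


# Usage:
#   backlog = rank_pages_by_opportunity(striking_distance)
#   backlog.to_csv("page_backlog.csv", index=False)
```

Because striking_distance is already sorted by impressions, the top_queries column lists each page's highest-impression terms first.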

Scaling up: when to graduate to BigQuery

The pagination approach works well up to a point, but if you're pulling many properties daily or need years of history, the per-request caps and quotas become a bottleneck. Google's Bulk Data Export streams complete, unsampled Search Console data into BigQuery, sidestepping the row limits entirely. For large-scale or historical analysis, see our walkthrough of the BigQuery bulk export.
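
Before moving to BigQuery, one intermediate option is to split a long range into one-day windows and run the pagination loop once per day, since the daily row ceiling mentioned in Step 6 applies per day. This is only a sketch under that assumption: daily_windows is a hypothetical helper, and many small requests consume more of your quota than one large one.

```python
from datetime import date, timedelta


def daily_windows(start_date, end_date):
    """Yield (day, day) ISO date pairs covering the range, inclusive."""
    current = date.fromisoformat(start_date)
    last = date.fromisoformat(end_date)
    while current <= last:
        iso = current.isoformat()
        yield iso, iso
        current += timedelta(days=1)


# Usage with the fetch_all_rows function from Step 6:
#   all_rows = []
#   for day_start, day_end in daily_windows("2026-04-01", "2026-04-30"):
#       rows = fetch_all_rows(service, SITE_URL, day_start, day_end, DIMENSIONS)
#       for row in rows:
#           row["date"] = day_start   # tag rows so days stay distinguishable
#       all_rows.extend(rows)
```

Rows pulled this way are per-day figures, so aggregate them deliberately: sum clicks and impressions, and weight position by impressions rather than averaging it directly.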

A no-code shortcut

Writing and maintaining this script is worthwhile if you need full control, but it isn't the only path. Search Console Tools is a free, browser-based app: sign in with Google OAuth, pick a property, and it pulls your Search Analytics data and turns it into content briefs automatically. No Cloud project, no key files, no pagination code. It's the fastest way to get the same insights this script produces, and a good way to validate your queries before automating them. For a broader landscape, our roundup of the best Search Console tools for 2026 compares the options.

Frequently Asked Questions

What is the Google Search Console API used for?

The Search Console API lets you programmatically retrieve Search Analytics data (clicks, impressions, CTR, and average position) broken down by query, page, country, device, and date. It's used for automated SEO reporting, large keyword pulls beyond the UI's limits, dashboards, and building content briefs. You can only access properties you own or have been granted permission to in Search Console.

Should I use OAuth or a service account for the GSC API?

Use the OAuth desktop flow for interactive, personal use, where you can open a browser once to consent and you already have access to the property. Use a service account for unattended automation like cron jobs, servers, and CI pipelines. With a service account you must add its email address as a user on the property in Search Console, or every call returns a 403.

How do I get more than 25,000 rows from the Search Analytics API?

A single searchanalytics().query() request returns at most 25,000 rows. To get more, paginate by incrementing the startRow field by 25,000 on each call and keep requesting until a page comes back with fewer rows than your rowLimit, which signals the end of the data. The pagination loop in this tutorial handles that automatically.

Why am I getting a 403 error from the Search Console API?

A 403 almost always means the authenticated identity lacks access to the property. With a service account, confirm you added its email address as a user under Settings -> Users and permissions in Search Console. Also check that the Search Console API is enabled in your Cloud project and that your siteUrl exactly matches the property string, including sc-domain: for Domain properties.

Is the Google Search Console API free to use?

Yes. The API itself has no cost, and you're querying data for properties you already manage. There are usage quotas (requests per day and per minute), but they are generous for typical reporting workloads. For very large or historical needs, the BigQuery Bulk Data Export is the recommended path and incurs only standard BigQuery storage and query costs.

What scope do I need to read Search Console data?

For reporting you only need the read-only scope https://www.googleapis.com/auth/webmasters.readonly. It grants access to Search Analytics queries and property metadata without allowing any modifications. If you later need to manage sitemaps or property settings, you would request the broader https://www.googleapis.com/auth/webmasters scope instead.