- The GTM Cookbook
- Posts
- How to Scrape (Almost) Anything
How to Scrape (Almost) Anything
Welcome back to Edition #2 of the GTM Cookbook 👨🏼‍🍳
This week I want to outline a problem that just about every GTM operator has faced at some point: scraping the exact leads you want to aggregate.
Not every lead list is in Apollo, ZoomInfo, 6Sense, or any of the other traditional data providers. Sometimes, they’re on niche websites, in government databases, and across the internet in a semi-structured way. This guide is going to show you how we scrape leads for our clients from wherever they want us to scrape.
At The Kiln, we recently scraped the entire FMCSA database for a client using these methods, which resulted in more than 1 million records.
In short, the difficulty of a scrape comes down to two things: how structured the data is, and how hard the website works to keep you from scraping it. We’ll list every method and when you should use it, just to make things easier.
Also, our very own Ankit Singh put the entire technical side of this guide together. He’s our in-house scraping expert, a technical wizard, and someone you should definitely follow if you’re interested in learning about the deeply technical side of GTM.
Without further ado, here’s the guide to scraping just about anything:
1. Clay Chrome Extension

Yep, Clay has a Chrome extension that allows you to scrape websites and import the data straight into Clay. In my early days at Clay, I was tasked with mapping out hundreds of pages to help users get more out of the tool. It’s quite simple to use, and is helpful for basic lists like the example above (Y Combinator’s Website). It’s also free, so worth a try before going to more complex cases.
Use Case: Best suited for simple scrolling and scraping.
Limitations: Does not support pagination.
Example: Scraping Y Combinator website data.
Loom on how to use it: https://www.loom.com/share/039814ea5f3f45a2a36021eaf5d8ec03?sid=dc96cfcd-701c-4e31-8655-3a302193857e
More Information: Clay for Chrome
2. Phantombuster
Phantombuster is a great tool for scraping specialized things such as LinkedIn followers and engagement. They offer individual scrapers called “phantoms,” each built for a specific scraping task: you pick one, follow the connection instructions, and let it run. It’s super easy to use and great for specific kinds of scraping tasks. I highly recommend you check out their Phantoms List to see if it could help with your use case.

Use Case: Specialized in scraping data from LinkedIn, Instagram, and X (Twitter), including:
Likers & commenters on posts.
LinkedIn search results.
Extracting LinkedIn profiles with specific keywords in job titles.
Example: Scraping people with the title "Clay" who are following Patrick Spychalski.
Loom on how to use it: https://www.loom.com/share/4ee428c5df4b4bad92d8760b686031c2?sid=a0cee6fe-d092-4f68-83a1-59a15caaafd1
More Information: Phantombuster
3. Apify

Apify is another marketplace of scrapers that lets you complete specific scraping tasks. They call these “Actors,” and they’re easy to run and to connect to Clay via the native integration. Check out their store here, or you can create your own.
Use Case: A marketplace of scrapers (actors) that supports a wide range of platforms, including:
LinkedIn Jobs
Indeed Jobs
Crunchbase
Facebook Pages
Instagram Pages
Meta Ad Library
Many other web sources
Example: Scraping Crunchbase for company data.
Loom: https://www.loom.com/share/85c9f1c08c5940239122c2c0231c3228?sid=af58b77b-e0dd-43d2-8d38-745dbf18a8f4
More Information: Apify
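If you’d rather drive an Actor programmatically instead of through Clay’s native integration, Apify also exposes a REST API. Here’s a minimal sketch using the `run-sync-get-dataset-items` endpoint; the actor ID and input fields below are placeholders, since every Actor documents its own input schema on its store page.

```python
# Hedged sketch: run an Apify Actor over the REST API and get its results.
# The actor ID and run_input shown in the usage note are hypothetical.
import requests

def run_actor(token: str, actor_id: str, run_input: dict) -> list[dict]:
    """Run an Actor synchronously and return its dataset items as JSON.

    actor_id uses the "username~actor-name" form (note the tilde, not a slash).
    """
    url = f"https://api.apify.com/v2/acts/{actor_id}/run-sync-get-dataset-items"
    resp = requests.post(url, params={"token": token}, json=run_input, timeout=300)
    resp.raise_for_status()  # surface auth errors / bad actor IDs early
    return resp.json()
```

Usage would look something like `run_actor("YOUR_APIFY_TOKEN", "apify~web-scraper", {"startUrls": [{"url": "https://example.com"}]})` — swap in the Actor and input you actually need.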
4. Octoparse

Octoparse is the first tool on this list that can handle truly complex scrapes on essentially any website. It lets you build custom scraping workflows that get past barriers such as 2FA, unusual clicking patterns, and more. I highly recommend you check out Ankit’s Loom below for a quick rundown on how to use it.
Use Case: Suitable for both simple and complex websites requiring:
Logins
Two-Factor Authentication (2FA)
Workflows for clicking actions
Pagination
Infinite scrolling
Custom selection of elements
Example: Scraping Amazon product data.
Additional Use Case: Supports login via cookies for session-based scraping.
More Information: Octoparse
Loom: https://www.loom.com/share/5702cf065c91483f97d45d7c3eae8516?sid=7fc584b6-7779-447b-a708-db088540a6eb
5. Python (Selenium & Beautiful Soup)

When all else fails, a well-executed Python script can almost always do the trick.
Use Case: Provides maximum flexibility in web scraping by allowing full control over:
Automating browser interactions using Selenium.
Parsing HTML content with Beautiful Soup.
Running Chromedriver for handling dynamic content.
Loom: https://www.loom.com/share/e7136d5ab9a943c6bf428132feed8c91?sid=f8a650fc-400c-4746-b04b-20bbb1fbaf88
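To make the parsing half concrete, here’s a minimal Beautiful Soup sketch. The HTML snippet and CSS classes are invented for illustration; on a real site you’d fetch the page with `requests` (or grab `driver.page_source` from Selenium when the content is rendered by JavaScript) and adjust the selectors to match the page.

```python
# Minimal sketch: parsing a lead list out of static HTML with Beautiful Soup.
# SAMPLE_HTML and the "company" class are made up for this example.
from bs4 import BeautifulSoup

SAMPLE_HTML = """
<div class="directory">
  <div class="company"><h3>Acme Logistics</h3><a href="https://acme.example">site</a></div>
  <div class="company"><h3>Beta Freight</h3><a href="https://beta.example">site</a></div>
</div>
"""

def extract_companies(html: str) -> list[dict]:
    """Return one record per company card: name + website."""
    soup = BeautifulSoup(html, "html.parser")
    records = []
    for card in soup.select("div.company"):
        records.append({
            "name": card.h3.get_text(strip=True),
            "website": card.a["href"],
        })
    return records

print(extract_companies(SAMPLE_HTML))
```

The same `extract_companies` function works unchanged whether the HTML came from `requests.get(...).text` or from a Selenium-driven browser — only the fetching step differs.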
I hope this added some value for you, and feel free to reach out with any questions!
And to wrap things up: if you’re looking for the influencers template I posted about two days ago, here you go! → https://app.clay.com/shared-table/share_F9RWGq4bDbhB?via=b8a689