How to Scrape Yelp for Free in 2024 [No-Code]
With 135 million monthly visitors, Yelp is the most popular business directory in the US. In this article, we'll learn how to scrape Yelp listings for free, without coding.
But why would I need to scrape Yelp in the first place? What can I do with the Yelp dataset?
Scraping Yelp has countless benefits for entrepreneurs, marketers, researchers, and businesses across various industries. Here are some key use cases:
OK, I get it. Scraping Yelp is highly beneficial for me. But it can land me in some serious legal trouble. What if it’s illegal to scrape Yelp?
The short answer is yes, you can scrape Yelp legally. But it’s tricky and depends on the question. If you ask “Does Yelp allow scraping its data?”, they’ve answered this question on their support center, and the answer is no.
So does it mean scraping Yelp is illegal? Well, not exactly.
The terms “NOT ALLOWED” and “NOT LEGAL” are not the same, as u/PM_ME_SOME_ANY_THING and u/mrIjoanet said in this Reddit post.
Although Yelp explicitly prohibits scraping in its terms of service, that doesn’t make web scraping illegal. Web scraping is legal if done within certain boundaries.
Read our detailed article on the legality of web scraping.
You can scrape publicly available data without harming a website or user privacy. But make sure you comply with data protection laws like GDPR and copyright laws.
In other words, even if prohibited by the platform, it’s not illegal. As long as we don’t overload the website and respect their privacy guidelines, scraping Yelp is fully legal.
Or as some dudes on reddit say:
Though I don’t wanna 🤬 Yelp like Mr. u/SanFranLocal, the point is, you can scrape Yelp legally. If you’re still uncertain about it, you can initiate a legal inquiry with Yelp.
But why f##k em if we could use the official API? Let’s find out.
To find the API, I visited Yelp’s developers portal. Among various other tools and APIs, I found 2 ways to access Yelp data. The first option is Yelp’s open dataset.
If you want sample data, this option may be suitable for you. But it’s limited. You can access only 150k business listings from 11 metro areas. So for commercial use cases, this one’s a NO NO.
The 2nd option is Yelp Fusion API. This API offers almost every type of data. From business listing attributes to user reviews, it has all you need. The documentation is pretty solid too.
Then why aren’t we using it? Well Fusion API has limitations like:
Firstly, the API allows only 500 requests per day. This isn’t enough for business use and Yelp doesn't allow commercial use of Fusion API either.
For commercial use, you need to apply for Fusion Enterprise API. But it has a complicated application process and no transparent pricing plan available.
The second limitation is the number of supported locales. You can’t extract data from all regions. This API only supports a limited number of regions.
Lastly, it doesn’t show many listings that may appear in Yelp search results. Any listing that doesn’t have a review is not fetched by this API.
So what other options do we have? Well, we can either code a web scraper or use a ready-made script from GitHub. But this isn’t reliable as Yelp can ban your IP.
If you’re curious about how to code a Yelp scraper, I’ve got you covered. Check out this amazing article on how to scrape Yelp using Python and Requests.
But not everyone is a nerd. Not everyone can read HTML, CSS, JavaScript gibberish and code a perfect web scraper. For context, look at this Yelp results web page:
The second option is to use a no-code solution. But there are too many no-code scrapers on the internet. Which one is best? Well, I’ve compared the 5 best Yelp scrapers for you.
In this tutorial, I’ll use Lobstr.io, the best Yelp scraper to extract business listings without coding. Let’s roll.
I'm going to use Yelp Search Export automation from Lobstr.io's store. But first, let me introduce you to some key features of this awesome scraping tool:
So lobstr.io is cool, but how much does it cost?
Lobstr.io offers a really flexible and transparent price range. You can opt for:
OK dope, now let’s learn how to scrape Yelp for free with Yelp Search Export.
We’ve also published a detailed piece on scraping Google Maps without coding. Check it out here.
There’s nothing nerdy about Lobstr.io. We’ll scrape Yelp business listings in 6 simple steps.
I’ll be scraping all the Restaurants in San Francisco listed on Yelp. Let’s go 💨
First and foremost, we’ll get the URL of the Yelp search results page. Go to Yelp, enter a location and a keyword, then copy the URL from the address bar.
Here we go, our URL is:
https://www.yelp.com/search?find_desc=Restaurants&find_loc=San%20Francisco%2C%20CA
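If you’d rather generate these URLs than copy them from the address bar, here’s a minimal Python sketch. It assumes the two query parameters visible in the URL above (find_desc for the keyword, find_loc for the location) are all you need. The helper name build_yelp_search_url is made up for illustration.

```python
from urllib.parse import quote, urlencode

YELP_SEARCH = "https://www.yelp.com/search"

def build_yelp_search_url(keyword: str, location: str) -> str:
    """Build a Yelp search URL from a keyword and a location (hypothetical helper)."""
    params = {"find_desc": keyword, "find_loc": location}
    # quote_via=quote encodes spaces as %20 instead of +, like the URL copied from the browser
    return f"{YELP_SEARCH}?{urlencode(params, quote_via=quote)}"

print(build_yelp_search_url("Restaurants", "San Francisco, CA"))
```

Swap in any other keyword or city to get more search URLs to feed the scraper.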
We can also split the job into multiple URLs to get more precise results. Let’s filter the search results by neighborhood. I’m going to target these 4 neighborhoods:
Let’s apply the filter and copy the URLs. Here we go.
Now let’s move to step 2.
To create a squid, you’ll need a Lobstr.io account. Don’t have one? It’s free to create an account. You don’t need to enter any credit card information. Go sign up now.
Once you’re in, creating a squid is easy. Click Create New Squid and type “yelp” in the search bar. Click the “Yelp Search Export” squid and you’re ready to configure it.
This will take you to the tasks area. Let’s feed the URLs to our scraper.
In Add tasks, we’ll add our Yelp search URLs. You can add them manually by pasting them one by one and clicking the add button.
But what if I have 100s of URLs? Luckily, Lobstr.io supports file upload. You can upload the URLs in bulk by saving them in a txt, csv, or tsv file.
Let’s add the URLs using the upload file option. Make sure at least 1 column is named url.
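If you have lots of URLs, you can generate the upload file instead of typing it by hand. Here’s a rough Python sketch. The file name yelp_urls.csv and the example URLs are placeholders (swap in your own filtered URLs), and the single url header follows the requirement mentioned above.

```python
import csv

# Placeholder URLs: replace these with the search URLs you copied from Yelp
urls = [
    "https://www.yelp.com/search?find_desc=Restaurants&find_loc=San%20Francisco%2C%20CA",
    "https://www.yelp.com/search?find_desc=Cafes&find_loc=San%20Francisco%2C%20CA",
]

def write_url_file(urls, path="yelp_urls.csv"):
    """Write one URL per row under a single 'url' header and return the file path."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["url"])             # header row with the url column
        writer.writerows([u] for u in urls)  # one URL per line
    return path

write_url_file(urls)
```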
Great. All 4 URLs imported. Now let’s explore the settings.
In basic settings, you can choose the maximum number of results to scrape per URL. By default it’s 240. If you want to scrape fewer than 240, this option is for you.
Next is the “when to end the run” option. You can choose whether you want to end the run when all your daily credits are consumed or when all the tasks are completed.
If you select the second option, your squid will pause once all your daily credits are consumed. The task will resume the next day where it left off.
I have already explained the “when to end the run” feature and how pause and resume works in this article.
You can use the Advanced Settings to set the concurrency, i.e. the number of bots launched simultaneously on your squid. The formula is simple: more concurrency = faster scraping.
The free plan allows you to launch only 1 bot. You can upgrade to the premium or business plan to add more bots.
Toggling Unique Results will remove duplicates from the data and No Line Breaks will remove line breaks from the text fields, making it easy to export to Excel.
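To make those two toggles concrete, here’s a rough Python sketch of what this kind of cleanup could look like on exported rows. It only illustrates the idea and is not Lobstr.io’s actual implementation. The field names (name, address) and sample rows are made up.

```python
def remove_line_breaks(row: dict) -> dict:
    """Collapse line breaks (and other runs of whitespace) in text fields into single spaces."""
    return {
        key: " ".join(value.split()) if isinstance(value, str) else value
        for key, value in row.items()
    }

def unique_rows(rows: list) -> list:
    """Drop exact duplicate rows, keeping the first occurrence and the original order."""
    seen = set()
    unique = []
    for row in rows:
        fingerprint = tuple(sorted(row.items()))  # hashable snapshot of the row
        if fingerprint not in seen:
            seen.add(fingerprint)
            unique.append(row)
    return unique

# Made-up rows: the first two are duplicates and contain a line break
rows = [
    {"name": "Example Diner", "address": "1 Main St\nSan Francisco, CA"},
    {"name": "Example Diner", "address": "1 Main St\nSan Francisco, CA"},
    {"name": "Sample Cafe", "address": "2 Oak Ave"},
]
cleaned = unique_rows([remove_line_breaks(r) for r in rows])
print(cleaned)
```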
Once you’ve configured the settings, click save and move to Notifications. This is where you can set real-time email notifications.
You can choose to receive notifications when a task is successfully completed or when the squid encounters an error. Save your preference and move to the final step.
In the final step, we choose how to launch the Yelp scraper. We can launch the scraper manually by clicking the Save and Extract button. It’ll instantly start collecting results.
But what if I want to scrape regularly? A cool way to do it is by automating the scraper using the schedule feature. You can schedule the squid to run Daily, Weekly, or Monthly.
I’ve explained how the schedule feature works with a solid example in this article.
After choosing your launch preferences, you’ll be redirected to the console. This is where you can see extraction progress in real time. You don’t have to keep this window open.
Close the window or even the web browser and check back later for results. You’ll receive an email when the job is finished. You can download the results as a .csv file.
But I prefer viewing results in Google Sheets. Oh, did I tell you how to use the delivery option? Almost forgot 😞. As mentioned earlier, you can export results directly to 3rd party services.
To configure your export, click the delivery button and select your desired delivery method. Don’t forget to tick the checkbox to make sure all the data exports to the selected service.
So we extracted 200+ results from 4 different neighborhoods in San Francisco. Our boi brought us all the business data including contact information.
🥳
You can view the downloaded data in Excel or any other spreadsheet application. Or convert it to JSON or any other format you like.
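If you do want JSON, Python’s standard library can handle the conversion in a few lines. This is a minimal sketch, assuming the export is a regular comma-separated file with a header row. The sample columns are placeholders, not the scraper’s actual output schema.

```python
import csv
import io
import json

def csv_to_json(csv_text: str) -> str:
    """Convert CSV text with a header row into a JSON array of objects."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return json.dumps(rows, indent=2, ensure_ascii=False)

# Made-up sample standing in for the downloaded file
sample = "name,rating,review_count\nExample Diner,4.5,120\nSample Cafe,4.0,87\n"
print(csv_to_json(sample))
```

For a real export, read the file contents first and pass the text to csv_to_json. Keep in mind that every value comes out as a string, so convert ratings and counts to numbers yourself if you need them.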
Yelp only shows 240 results per search query, so this scraper is also limited to 240 results per URL. You can’t scrape more than that from a single URL.
You can split the job into multiple URLs. Just like I did by applying the neighborhood filter. You can use other filters as well. Enable the unique results filter to remove duplicates from the final output.
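The math behind splitting is simple. As a back-of-the-envelope sketch in Python, using the 240-results-per-URL cap (the function name is just for illustration):

```python
import math

MAX_RESULTS_PER_URL = 240  # cap per search URL, as stated in this article

def min_urls_needed(target_results: int) -> int:
    """Lower bound on how many search URLs are needed to reach a target result count."""
    return math.ceil(target_results / MAX_RESULTS_PER_URL)

print(min_urls_needed(1000))  # 5
```

Treat this as a lower bound. Filtered searches can overlap, and once duplicates are removed you may need a few more URLs to hit your target.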
Lobstr.io has its own proxy network. You don’t need a VPN, and you don’t have to deal with CAPTCHAs or worry about other security measures to avoid being banned.
No, while using Lobstr.io, your IP is not at risk. Since the scraper runs on cloud with Lobstr’s own proxy network, you’re not at risk of IP banning.
This scraper shows review count and star rating only. It doesn’t scrape review text for you. If you want to scrape Yelp reviews, you can submit a request here.
That’s it. This was our quick but complete tutorial on scraping Yelp listings without coding. We scraped Yelp without getting banned, and completely for free.
Try Lobstr.io, it’s free. You don’t need to add any credit/debit card information. There’s no limited trial. You can upgrade whenever you want and scrape without getting banned.
Happy Scraping 🦀
Self-proclaimed Head of Content @ lobstr.io. I write all those awesome how-tos and listicles, and troll our competitors (they deserve it).