How to scrape French Listings data on SeLoger with no code?
Overview
SeLoger is a top-tier French Real Estate website. Each year, it registers thousands upon thousands of fresh listings, originating mainly from professional publishers. According to this excellent article from Immomatin, published in February 2020, SeLoger is the second most popular Real Estate website in France, with 5.74 million in November 2019, against 12.00 million for Leboncoin and 3.47 million for Bien’Ici.
On top of being highly popular, SeLoger provides extremely high-quality datasets, with strong and relevant pieces of information. Of course, we can find the usual Real Estate-related data points, such as area, price, or neighborhood, as shown below:
We'll also find some advanced and extremely valuable elements, such as availability date, kind of exterior area, kitchen type, air conditioning…
Why not scrape all these valuable data points with a quick Python script? Let’s code something quick, time is precious…
import requests

# Fetch the SeLoger homepage and save the raw HTML for inspection
r = requests.get("https://www.seloger.com/", timeout=10)
with open('response_seloger.htm', 'w', encoding='utf-8') as f:
    f.write(r.text)
Yet when we open the response… what a horror!
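Before opening the saved file by hand, a quick check can tell us whether the response looks like a real page or a block page. The snippet below is only a rough sketch: the status codes and text markers are assumptions chosen for illustration, not a description of what SeLoger actually returns, and `is_probably_blocked` and `check_homepage` are hypothetical helper names.

```python
def is_probably_blocked(status_code: int, body: str) -> bool:
    """Heuristic check for a block page.

    Assumption: block pages often come with a 403 or 429 status, or mention
    a captcha in the HTML. These markers are illustrative, not exhaustive.
    """
    blocked_statuses = {403, 429}
    markers = ("captcha", "access denied")  # hypothetical markers
    lowered = body.lower()
    return status_code in blocked_statuses or any(m in lowered for m in markers)


def check_homepage(url: str = "https://www.seloger.com/") -> bool:
    """Fetch the page and apply the heuristic (performs a real HTTP request)."""
    import requests  # third-party package, imported only when actually fetching

    r = requests.get(url, timeout=10)
    return is_probably_blocked(r.status_code, r.text)
```

Calling `check_homepage()` returns `True` when the response matches one of the assumed signals. A `False` result does not guarantee that the page contains usable listings.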
In these conditions, how can we collect data at scale on SeLoger, with no code? For free? Is it even possible?
Setup
Fortunately, our fantastic development team has deployed a tremendous ready-made crawler for you. First, go to our dedicated crawler page, available just here:
https://lobstr.io/store/a7e1864ab37570369c69a68d1b943d8b/seloger-iter-listings
Then simply click on ‘Start Now’. Here we go!
If you click on the small icon, just beside ‘Output’, you can download a 100-line sample. One click. It’s free. You’re welcome.
Then, go to SeLoger, and choose all the filters you need: type of real estate structure, location, price filters… Once complete, copy the URL from your browser and keep it safe, e.g.
https://www.seloger.com/list.htm?projects=1&types=2&places=[{%22divisions%22:[2248]}]&mandatorycommodities=0&enterprise=0&qsVersion=1.0&m=search_hp_last
Let’s go to Corsica, summer is magic.
⛵
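If you are curious about what the search URL above actually carries, here is a small sketch that decodes its query string with the Python standard library. It only reads what is in the URL. The helper name `read_filters` is hypothetical, and the meaning of each parameter value (for instance `projects=1` or `types=2`) is not documented here, so the code makes no claim about it.

```python
import json
from urllib.parse import urlsplit, parse_qs

# The example search URL from this tutorial
SEARCH_URL = (
    "https://www.seloger.com/list.htm?projects=1&types=2"
    "&places=[{%22divisions%22:[2248]}]&mandatorycommodities=0"
    "&enterprise=0&qsVersion=1.0&m=search_hp_last"
)


def read_filters(url: str) -> dict:
    """Return the query parameters of a search URL as a dictionary."""
    query = urlsplit(url).query
    # parse_qs returns a list per key; each key appears once here
    params = {key: values[0] for key, values in parse_qs(query).items()}
    # 'places' is a percent-encoded JSON value, so decode it as JSON
    params["places"] = json.loads(params["places"])
    return params


filters = read_filters(SEARCH_URL)
print(filters["places"])  # [{'divisions': [2248]}]
```

This makes it easy to double-check that the URL you saved really contains the filters you selected before handing it to the crawler.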
Now, let’s switch back to our lobstr area, paste the URL we just saved (1), then simply click on Save (2):
If you click on ‘Add Input’, you can track listings from several search URLs in a row! You can also play with the ‘Max Pages’ advanced setting to limit the scope of collection. That’s all there is to it.
Finally, since we want to launch our crawler only once, manually (it’s a demo, right?), let’s simply click on Manually (1) and Save (2):
Launch
Time has come… let’s launch our awesome collection machine!
Simply click on the ‘Launch’ button, on the top-right corner of the screen:
And here we go! The crawler is now running at full speed — collecting approx. 20 listings per minute:
With 15 minutes free per day (forever), you can thus collect around 300 fresh listings per day. It’s substantial. It’s free.
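The daily figure is simple arithmetic on the two numbers quoted above. Here it is as a tiny sketch, where `daily_listings` is a hypothetical helper name and the speed is the approximate one observed during this run, not a guaranteed rate:

```python
LISTINGS_PER_MINUTE = 20   # approximate crawl speed quoted above
FREE_MINUTES_PER_DAY = 15  # free daily allowance quoted above


def daily_listings(listings_per_minute: float, minutes_per_day: float) -> float:
    """Estimate how many listings fit into a daily time allowance."""
    return listings_per_minute * minutes_per_day


print(daily_listings(LISTINGS_PER_MINUTE, FREE_MINUTES_PER_DAY))  # 300
```

Swap in your own observed speed to get an estimate for your search.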
Enjoy
After a couple of seconds, the data is fully available — and hopefully juicy:
Simply click on the big red button at the top right (the elephant in the room) and enjoy your complete SeLoger dataset. A dataset you collected with no code, in a couple of seconds, for free.
When opening it with Numbers, you’ll enjoy the exhaustiveness and quality of a fully clean, structured and usable set of data:
In the end, we collected 17 listings in precisely 88 seconds, which is about one listing every 5 seconds (roughly 12 listings per minute). Clean. Fast.
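If you would rather look at the export in Python than in a spreadsheet, the sketch below counts the rows and lists the column names. It assumes the export is a UTF-8 CSV file with a header row. The file name in the usage comment and the helper names are hypothetical, and the code does not assume any particular column names.

```python
import csv
import io


def parse_listings(csv_text: str) -> list[dict]:
    """Parse CSV text with a header row into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def load_listings(path: str) -> list[dict]:
    """Read an exported file from disk (assumed to be UTF-8 CSV)."""
    with open(path, newline="", encoding="utf-8") as f:
        return parse_listings(f.read())


def summarize(rows: list[dict]) -> tuple[int, list[str]]:
    """Return the number of rows and the column names found in the export."""
    columns = list(rows[0].keys()) if rows else []
    return len(rows), columns


# Usage, with a hypothetical file name:
# rows = load_listings("seloger_export.csv")
# print(summarize(rows))
```

Every value comes back as a string, so convert fields such as prices or areas to numbers before doing any calculation.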
You get 15 minutes of free collection each day. Forever. If you need a higher plan, feel free to check our pricing table, with extremely competitive offers starting at 20 EUR per month for 1h of data collection per day.
Conclusion
SeLoger is an undisputed source of data for Real Estate in France, especially for listings from professionals. Although it is protected by advanced bot-mitigation solutions, lobstr lets you collect data at scale. Code-free. Money-free. In a couple of seconds. Happy scraping!
🦞
Co-founder @ lobstr.io since 2019. Genuine data avid and lowercase aesthetic observer. Ensure you get the hot data you need.