Java, the GPL, and Real Estate Listings


By Patrick   Wed, 15 Aug 2012, 12:34pm   1,575 views   17 comments
In Menlo Park CA 94025

It occurred to me that the distribution problem with real estate listings is just like the distribution problem with software.

It would be best for sellers to get their listing the widest distribution possible. It would be best for buyers to be able to see all listings on every real estate site, rather than going to various sites trying to find them all.

But real estate listing sites like the MLS's, Zillow, and Craigslist all have harsh and restrictive terms of use that forbid the scraping of "their" listings and the integration of those listings with listings from other sites. They may even try to claim the copyright on your for-sale listing, like Craigslist does.

Those sites also have mutually incompatible input formats, so that a seller has to manually go around to each site where he wants to list his house and input the same data over and over again.

This is all analogous to the problem with software distribution, where the operating system (OS) vendors (Microsoft, Apple, and mobile phone OS's) all try to get programmers to write their programs for their platform alone by keeping their OS interface incompatible with all others. Writing for just one OS does not get the developer the maximum possible sales, and also limits the buyer's options. The programs he wants to buy may not run on the computer or phone he's got.

The solution for software is Java and the GPL. Java is an interface layer: you can write a program in Java and it will run on any computer that has a Java "virtual machine", which is pretty much all computers. The GNU General Public License (GPL) is a clever use of copyright that requires anyone who distributes software built on GPL-licensed code to license it under the GPL as well, which obliges the distributor to make all the source code available. In other words, if you get for free, you must also give for free.

Can I implement some form of these solutions for real estate listings?

Let's say I create the standard "Patrick.net Real Estate Listing Format" (PRELF?) which is a dead-simple standardized way to describe your property, maybe starting like this:

street address: 123 Shady Lane
city: Minneapolis
state: MN
zip: 55199
br: 3
ba: 2
...

Anyway, something REALLY simple and standard that any idiot programmer could instantly parse and slurp into a database. I could use some RSS format, but even that is more complex than I'm thinking.
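As a rough illustration of how little code such a format would need, here is a minimal Python sketch of a parser for the "key: value" lines above. The function name `parse_listing` is made up for this example, and PRELF itself is only a proposal, so treat the details as assumptions rather than a spec.

```python
def parse_listing(text):
    """Parse 'key: value' lines into a dict.

    Lines without a colon (blank lines, the '...' placeholder) are skipped.
    Values stay as strings so that zip codes keep any leading zeros.
    """
    listing = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue
        listing[key.strip().lower()] = value.strip()
    return listing


SAMPLE = """\
street address: 123 Shady Lane
city: Minneapolis
state: MN
zip: 55199
br: 3
ba: 2
...
"""

listing = parse_listing(SAMPLE)
print(listing["city"], listing["br"])
```

From a dict like this, loading the listing into a database is one INSERT statement.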

The other critical piece would be the Patrick.net Public License (PPL) which would require only that:

1. Every site that includes a PPL-licensed listing agrees that that listing and ALL of their other listings are available for free to everyone, for any use, by whatever automated scraping mechanism.

2. The PPL notice itself be included in every derivative listing.

That's about it. Think it could work? It worked pretty well for software. I wouldn't make any money from it, but Patrick.net would get fame as the place that unified real estate listings to benefit both buyers and sellers.

Most Liked Comments

  1. Biff Baxter


    6:18pm Wed 15 Aug 2012

    I think you better stop doing free shit. It doesn't pay enough. Either that or stop showering, grow your hair long, start smoking a lot of dope and join a drum circle.

    Biff

  2. Dan8267


    8:32pm Wed 15 Aug 2012

    The problems with real estate listings are

    1. Restrictions on use of the information as you pointed out.
    2. Incorrect information due to sellers including realtors deliberately misrepresenting properties. For example, "accidentally" listing a condo as a single family house.
    3. Incomplete information such as the history of the house, how many times it was listed, how long it has been on the market, and other things sellers don't want prospective buyers to know.

    The process of translating someone's data to a format you can use is called Extract, Transform, Load, or ETL. It's not a hard problem, as data warehousing companies do it all the time.

    It's not difficult to take a large CSV or other format file and transform the data so that it matches your database schema, or to automate that process for the various data vendors you use.
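A minimal sketch of that kind of transform, using only the Python standard library. The vendor column names (`Addr`, `Beds`, ...), the table layout, and the function name `load_listings` are all assumptions made up for illustration; a real feed would need its own mapping.

```python
import csv
import io
import sqlite3

# Hypothetical mapping from one vendor's CSV headers to our own column names.
VENDOR_COLUMNS = {"Addr": "street_address", "City": "city", "Beds": "br", "Baths": "ba"}


def load_listings(csv_file, conn, column_map):
    """Extract rows from a CSV file, rename and convert fields, load into SQLite."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS listings "
        "(street_address TEXT, city TEXT, br INTEGER, ba REAL)"
    )
    count = 0
    for row in csv.DictReader(csv_file):
        # Transform: rename the vendor's columns and strip stray whitespace.
        record = {ours: row[theirs].strip() for theirs, ours in column_map.items()}
        record["br"] = int(record["br"])
        record["ba"] = float(record["ba"])
        conn.execute(
            "INSERT INTO listings (street_address, city, br, ba) "
            "VALUES (:street_address, :city, :br, :ba)",
            record,
        )
        count += 1
    conn.commit()
    return count


vendor_csv = io.StringIO("Addr,City,Beds,Baths\n123 Shady Lane,Minneapolis,3,2\n")
conn = sqlite3.connect(":memory:")
loaded = load_listings(vendor_csv, conn, VENDOR_COLUMNS)
rows = conn.execute("SELECT street_address, city, br, ba FROM listings").fetchall()
print(loaded, rows)
```

Supporting another vendor then means writing another column map, not another loader.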

    The real problems are the three I listed above. The first is the hardest to solve. If you have data that no one else has, it's valuable. As soon as everyone else has it, it's nearly worthless. So the only value you can add to the industry is how easy it is to access, search, filter, and read the data. But doing a real good job on that may be enough to run a business.

    Of course, you cannot charge users of the site for access since everyone has the data. Nor can you charge sellers for advertising their real estate because they can do that elsewhere and you don't yet have a large enough customer base. And even when you get a large enough customer base, anyone could scrape your data.

    So you'd have to rely on ad revenue from related industries like furniture, home repairs, A/C repairs, plumbers, local restaurants, etc.

    That brings us to problem 2. Real estate sellers will want to lie about the property. To make your site better than the competition, you must detect and punish such lies. To do this, you must first make all sellers identify themselves with proof before posting properties. A credit card post can accomplish this.

    Then if a seller misrepresents a property, all the seller's postings are unpublished and the seller cannot publish more properties until paying a sizable fine, say $1000. You can crowdsource the detection of the misrepresentation like Craigslist does. Just put a link to "report post" and then let the user select from a list of reasons the post should be flagged.
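The unpublish-until-fined rule could be sketched roughly as below. The `Seller` class, the three-report threshold, and the method names are assumptions invented for this example; the comment only specifies the $1000 fine and the "report post" link.

```python
from dataclasses import dataclass, field

FINE_DOLLARS = 1000   # the example fine from the comment
FLAG_THRESHOLD = 3    # assumed number of reports that triggers a suspension


@dataclass
class Seller:
    name: str
    listings: dict = field(default_factory=dict)  # listing id -> published?
    reports: int = 0
    fine_owed: int = 0

    def publish(self, listing_id):
        """Publish a listing, unless the seller has an unpaid fine."""
        if self.fine_owed:
            raise PermissionError("fine must be paid before publishing")
        self.listings[listing_id] = True

    def report(self):
        """Record one 'report post' click; suspend the seller at the threshold."""
        self.reports += 1
        if self.reports >= FLAG_THRESHOLD and not self.fine_owed:
            self.fine_owed = FINE_DOLLARS
            for listing_id in self.listings:
                self.listings[listing_id] = False  # unpublish everything

    def pay_fine(self):
        """Clear the fine and the report count so the seller can publish again."""
        self.fine_owed = 0
        self.reports = 0


seller = Seller("demo seller")
seller.publish("listing-1")
for _ in range(FLAG_THRESHOLD):
    seller.report()
print(seller.listings, seller.fine_owed)
```

A real site would want a human to review the reports before fining anyone, since a raw report count is easy to game.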

    For the third problem, you'd have to make entering certain information mandatory for publishing a property. For example, the seller must select the zip code associated with the property. You'll have a map of zip codes to tax property appraisers and their websites where you can pull down sales information.

    Whenever you come across a zip code whose tax property appraiser you don't yet know, the seller must enter the appraiser's information including the URL to the tax appraiser's site before the property is published. In effect, you crowdsource the research of county tax appraisal sites.
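A minimal sketch of that zip-code lookup with the crowdsourced fallback. The registry contents, the example.org URLs, and the function name `appraiser_url_for` are placeholders, not real appraiser sites.

```python
def appraiser_url_for(registry, zip_code, seller_supplied_url=None):
    """Return the tax appraiser URL for a zip code.

    If the zip code is unknown, the seller must supply the URL, which is
    then remembered so that later sellers in that zip code are not asked.
    """
    known = registry.get(zip_code)
    if known is not None:
        return known
    if not seller_supplied_url:
        raise ValueError(f"appraiser site for zip {zip_code} is required before publishing")
    registry[zip_code] = seller_supplied_url
    return seller_supplied_url


# Placeholder data: zip code -> county tax appraiser site.
registry = {"55199": "https://appraiser.example.org/county-a"}

first = appraiser_url_for(registry, "55199")
second = appraiser_url_for(registry, "94025", "https://appraiser.example.org/county-b")
third = appraiser_url_for(registry, "94025")  # now known, no URL needed
print(first, second, third)
```

Seller-supplied URLs would need checking before they are trusted, since a seller who wants to hide sales history has a reason to enter a wrong one.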

    Then whenever you list a property, you can link to the correct site, or time permitting, pull in the data asynchronously and display it on your page. Doing the latter will require some AJAX and some page parsing (regular expressions will help).

    But, of course, like all websites you have a chicken and egg problem. Buyers won't be interested in your site until you have content from sellers and do the above features so that it's worth viewing that content on your site rather than others.

    However, sellers won't bother jumping through any hoops until you have a large base of customers. So you'd have to jump start the website by gathering content yourself and acting like a seller for properties you aren't getting a commission on.

    Dating sites solve this problem with fake accounts. I wouldn't recommend that for real estate though.

  3. The Original Bankster


    8:13pm Thu 16 Aug 2012

    the asymmetry of market information is what makes RE a lucrative market.

    If we did as you suggest then it would cease to be an investment and start to be places to live.

    this might be a good thing.

  4. APOCALYPSEFUCK is Shostakovich


    11:01am Fri 17 Aug 2012

    Right on

    The MLS should be reduced to an RSS feed and a terminal file format, XML would work fine with a standardized schema.
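For a sense of what an XML version might look like, here is a small Python sketch that serializes one listing with the standard library. No standardized schema exists in this thread, so the element names and the function name `listing_to_xml` are purely hypothetical.

```python
import xml.etree.ElementTree as ET


def listing_to_xml(listing):
    """Serialize a listing dict to an XML string, one child element per field."""
    root = ET.Element("listing")
    for key, value in listing.items():
        # XML element names cannot contain spaces, so use underscores.
        child = ET.SubElement(root, key.replace(" ", "_"))
        child.text = str(value)
    return ET.tostring(root, encoding="unicode")


xml_text = listing_to_xml(
    {"street address": "123 Shady Lane", "city": "Minneapolis", "br": 3, "ba": 2}
)
print(xml_text)

# Any consumer can read it back with the same library.
parsed = ET.fromstring(xml_text)
print(parsed.find("city").text)
```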

  5. Patrick


    3:01pm Wed 15 Aug 2012

    Kamesh Kompella says

    From what you mention, none of these sites provide you an API equivalent. They simply won't let you do anything besides work with the stuff they write.

    Right, none of those sites provide an API! So I'd provide a standardized listing format, and if sellers used it, then all those listing sites would have an easy way to slurp in the PPL-licensed listings.

    So they'd be adopting a standard API in exchange for getting the listings, but once they take on PPL-licensed listings, they have agreed to make all of their listings available under the same terms.

    Even if they don't format their own listings correctly, they've still given permission for someone else to do it.

    Kamesh Kompella says

    Let me ask you this. What does it take to get hold of the listings *without* using these sites? Where exactly do these sites get their data from? Can you go to the source?

    You can't. All listing sites try very hard to prevent the free flow of listing information, usually via their terms of use, but also by tricking users into giving away the copyright on their own listings.

    Kamesh Kompella says

    Can you get hold of this kind of data through other methods?

    No. That is exactly the problem to be solved.

  6. 37108605


    7:53am Thu 16 Aug 2012

    I have a serious problem with all these listing sites and rating real estate so-called "value" sites. They appear to me to average out sales. One real or false positive listing or so-called "sale" appears to skew the entire board of any of them, real or not.

    I rely on my own skills of what a place is really worth to me based on my own knowledge not what some online board or person tries to tell me it is worth.

  7. uomo_senza_nome


    11:17am Thu 16 Aug 2012

    Patrick says

    all listing sites try very hard to prevent the free flow of listing information, usually via their terms of use,

    So in other words, data monopoly.

    How would we break the vicious cycle? The data monopoly already has more than 90% of buyers/sellers cornered, right?

    Patrick says

    and ALL of their other listings are available for free to everyone

    I don't know why they would agree to this term, if it is not in their best interests to relinquish their data monopoly.

  8. Patrick


    4:19pm Thu 16 Aug 2012

    uomo_senza_nome says

    So in other words, data monopoly.

    How would we break the vicious cycle? The data monopoly already has more than 90% of buyers/sellers cornered, right?

    Yes, it's a data monopoly.

    We break it the same way the Microsoft monopoly is losing ground to Java and the GPL, not to mention Apple.

    We just have to provide more value to sellers so that they will list under the PPL.

  9. David9


    5:03pm Thu 16 Aug 2012

    Redfin doesn't cover all the markets, but the coverage they do do is rather complete. How do they accomplish this?
    And would you be able to do it?

    Hey, I don't fully understand the Java and GPL but you seem rather comfortable with it, so go for it, start with the Bay Area.

    (And yes, I'm feeling the creepy stalker thing too, but hey, I just found this new icon 'housing trap', kinda freakin me out too, ok? )

  10. Patrick


    5:08pm Thu 16 Aug 2012

    David9 says

    Redfin doesn't cover all the markets, but the coverage they do do is rather complete. How do they accomplish this?
    And would you be able to do it?

    They agree to play along with the current screw-the-buyer system, so they get access to MLS data. Part of the terms are things like never showing cheap foreclosures next to full-price houses on the same web page, even if they are the same model, across the street from each other. And never allowing negative comments on property or on realtors.

    I like your new icon. Where'd you get it?

  11. David9


    7:29pm Thu 16 Aug 2012

    Patrick says

    I like your new icon. Where'd you get it?

    Thanks for the validation. :)

    On this webpage, also where I got the idea for the book title

    http://mikeroeconomics.blogspot.com/2012/06/housing-trap.html

  12. kapone


    7:08am Fri 17 Aug 2012

    Patrick - This is not a "technology" issue. This is a "big money" issue. You have technology but not big money, i.e. you can't compete.

  13. uomo_senza_nome


    8:34am Fri 17 Aug 2012

    Patrick says

    We just have to provide more value to sellers so that they will list under the PPL.

    Sellers have to pay the real estate agent's company for using their service which gets the largest possible audience. When big money is involved, how would they voluntarily give up?

    I think your intent is to eliminate the need for a real estate agent, which is a good thing. But the whole market is already cornered. It is like trying to provide a free Windows version when the paid Windows version has already monopolized the OS market and patented the whole thing.

  14. Patrick


    10:04am Fri 17 Aug 2012

    Yes, it's very much like trying to compete with Windows by providing a free operating system.

    Which has been done! Look at Linux. Maybe it's not on your desktop, but it completely dominates the server side. Almost all web servers are Linux now.

    So there just needs to be an alternative for sellers, so that they no longer have to pay an agent to get the best price for their house.

    kapone says

    Patrick - This is not a "technology" issue. This is a "big money" issue. You have technology but not big money, i.e. you can't compete.

    It's definitely not technology, but it's not big money either. Linux took over all web servers with no marketing budget at all.

    It's possible to compete if I'm not out to play the same game. I just need to play a slightly different game.

    Do sellers really want maximum exposure for their listings? Maybe they feel other things are more important, like preventing public comments on their property, or the ability to erase the asking price history.

    If they just want maximum exposure, then some kind of free open PPL'd syndication of listings is the way to go.

  15. uomo_senza_nome


    10:42am Fri 17 Aug 2012

    Patrick says

    Look at Linux. Maybe it's not on your desktop, but it completely dominates the server side. Almost all web servers are Linux now.

    True. But reliable support, training and integration services come at a cost. (e.g., Red Hat). And companies using Linux servers need that support.

    I think if what you offer has serious value for sellers, you need to charge something for it. That gives an incentive for you (the product developer) to make it attractive as opposed to competition and to the users of the product, who need to pay much less than what they shell out today for current products (e.g., MLS).

  16. Kamesh Kompella


    2:16pm Wed 15 Aug 2012

    Your argument about Java and your Java listing equivalent is flawed, IMO. Java worked because Windows and other OSes allowed you to write software so long as you programmed to their interface. The key word is API.

    From what you mention, none of these sites provide you an API equivalent. They simply won't let you do anything besides work with the stuff they write. That is where the story is over. If they allowed you to get data or screen scrape, then your idea has life. The key enabler is that Zillow et al, need to share their toys. Right now, they have all gone home and you are standing alone in the playground.

    Let me ask you this. What does it take to get hold of the listings *without* using these sites? Where exactly do these sites get their data from? Can you go to the source?

    Can you get hold of this kind of data through other methods? And if you do, can you let me know too, please.

  17. omgbacon


    2:26pm Wed 15 Aug 2012

    sounds like the conversations that people were having back in the mid 90s around converting proprietary EDI data formats to standardized XML formats for better b2b interoperability.

Premium member Patrick is moderator of this thread.
