by Patrick
« First « Previous Comments 23 - 62 of 510 Next » Last » Search these comments
all the for-profit real estate sites except Craigslist are very hostile to being scraped.
Since when has Craigslist made a profit? Or even tried?
I talked to a lawyer about this years ago, and they assured me that I would definitely be sued if I tried to scrape, say, realtor.com.
I think that only applies if you get caught! I don't know the legal stuff, but I know that a hell of a lot of companies scrape websites, especially Facebook, LinkedIn, and MySpace, and they don't respect the robots.txt file. Just think about all the blatant copies of the same text you find in Google searches.
Now I'm not advocating scraping a site whose terms of use forbid it, but it definitely happens a lot. Also, I don't think spiders can agree to terms of service, so it's not breach of contract. And then there's Google cache. What happens if you just copy data from Google's cache?
That's why I'm glad I'm a developer and not a lawyer. Seems like law is about political connections and who has the most money to spend.
Still, at least you are safe with county property sites. Government sites can't prevent scraping since the Freedom of Information Act guarantees your right to copy the documents. That much I do know.
I've heard that Craigslist does make a fair amount of money off of its job boards. Maybe they would mind if someone scraped those.
I know there's a whole lot of scraping going on, but I'm a bit paranoid about realtors trying to shut down my site. I know some of them hate me, and I already got a threat from the NAR that I could not use "What your realtor won't tell you" as my tag line.
The problem with county sites is that there are 3,143 counties, and all their websites are unique.
Patrick, I'm assuming you are able to figure out going rent rates for areas.
I guess instead of just affordable (since affordable means different things to different people) - you could have a questionnaire where people could put in the low and the high of what they can afford to spend and you could list the areas they might want to search.
Yes! I do have a lot of data about going rental rates.
I could create just the sort of thing you mention, where people put in the low and high of rent and it gives them areas to look in, and maybe even specific rentals.
I like the idea of breaking up monster houses into shares. Might be a good compromise between being alone and having too many people around. I suspect the limiting factor will be bathrooms. Most people really want to have their own bathroom.
I've heard that Craigslist does make a fair amount of money off of its job boards.
Did not know that. I thought the founder, Craig, wasn't even trying to make money off the site. He turned down so many offers to buy the site because he was more interested in performing the service to the community. But that was many years ago during the .com bubble.
The problem with county sites is that there are 3,143 counties, and all their websites are unique.
Yes, that is a problem. Which is what makes it a great opportunity. It's a barrier to entry. If a person did collect all the information from every county, then that person would have a unique data set that was more useful than the publicly traded data out there.
Pretty much, something has to be hard to be worth doing because if it was easy, many other people would be doing it already or soon after you did it.
There's an old adage that applies. Industry pays for increased value.
Still, there may be an easier way than scraping to get the data. You could fill out 3,143 Freedom of Information requests asking for a complete copy of their database as it is. Since the requests are all the same except for a few fields (address, name of county, etc.), you could automate a lot of that.
Then you either postal mail or email the request depending on which is available. Obviously email is cheaper and easier. Ideally, they would give you a URL where you could download an archive rather than shipping you a disk or flash drive and requiring that you pay the cost of that. Downloads should be essentially free and their database isn't necessarily that large. A few GBs could hold a lot of data on properties. I've done some ETL work in the past.
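For what it's worth, the "same request, a few different fields" idea is basically a mail merge. Here is a minimal sketch in Python, assuming a made-up request wording and made-up county rows (the field names `county`, `office`, and `address` are hypothetical, not from any real county list):

```python
from string import Template

# Hypothetical request wording; real public-records requests vary by jurisdiction.
REQUEST = Template(
    "To: $office, $county County\n"
    "Address: $address\n\n"
    "I request a complete electronic copy of the county's property records "
    "database in its current format.\n"
)

def build_requests(counties):
    """Return one filled-in request letter per county row."""
    return [REQUEST.substitute(row) for row in counties]

# Made-up sample rows; the real list would have 3,143 entries.
counties = [
    {"county": "Example", "office": "Records Office", "address": "1 Main St"},
    {"county": "Sample", "office": "Assessor", "address": "2 Elm St"},
]
letters = build_requests(counties)
print(letters[0])
```

`Template.substitute` raises `KeyError` if a row is missing a field, which is probably what you want before sending out thousands of letters.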
Speaking of ETL, that would be the final step. Unfortunately, you'd have to do it once per county. Updates would be easy, but you'd need to have a parser for each county anyway. Best case scenario, the parser's a simple XML file that maps fields and maybe defines some record splitting/merging or field manipulation. But doing anything 3,143 times is a pain in the butt.
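To make the "parser is just a mapping" idea concrete, here is a rough sketch. It uses a plain Python dict in place of the XML file and assumes the county delivers CSV; the column names (`PARCEL_NO`, `SITUS`, `SALE_AMT`) and the common schema are invented for illustration:

```python
import csv
import io

# Hypothetical per-county mapping: source column name -> common field name.
COUNTY_A_MAP = {"PARCEL_NO": "parcel_id", "SITUS": "address", "SALE_AMT": "sale_price"}

def normalize(csv_text, field_map):
    """Rename one county's columns to the common schema, dropping unmapped ones."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return [
        {common: row[source] for source, common in field_map.items()}
        for row in rows
    ]

# Made-up sample data in the hypothetical County A layout.
sample = "PARCEL_NO,SITUS,SALE_AMT,CLERK\n001-234,12 Oak St,450000,jd\n"
records = normalize(sample, COUNTY_A_MAP)
print(records)
```

With this shape, adding a county means writing one more mapping rather than one more program, though record splitting/merging would need more than a dict.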
There are ways in which you could automate the generation of the parsers based on the contents of the data files you get. A sophisticated algorithm could reg-ex pattern match and statistically analyze the data in each field to determine the field's meaning and thus where to map it. So if you're willing to invest time in developing that software, you could greatly reduce the number of parsers you'd have to write yourself to just a few to get started.
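A very simplified sketch of that regex-plus-statistics idea: guess a column's meaning from the share of its values that match a known pattern. The patterns, the field names, and the 80% threshold are all assumptions for illustration; real county data would need far more patterns and some manual review.

```python
import re

# Hypothetical patterns; a real system would need many more.
PATTERNS = {
    "sale_date": re.compile(r"\d{4}-\d{2}-\d{2}"),
    "sale_price": re.compile(r"\$\d[\d,]*(\.\d{2})?|\d{1,3}(,\d{3})+(\.\d{2})?"),
    "zip_code": re.compile(r"\d{5}(-\d{4})?"),
}

def guess_field(values, threshold=0.8):
    """Return the field name whose pattern matches the largest share of values,
    or None if no pattern reaches the threshold."""
    if not values:
        return None
    best_name, best_share = None, 0.0
    for name, pattern in PATTERNS.items():
        hits = sum(1 for v in values if pattern.fullmatch(v.strip()))
        share = hits / len(values)
        if share > best_share:
            best_name, best_share = name, share
    return best_name if best_share >= threshold else None

print(guess_field(["2011-03-15", "2010-12-01"]))
print(guess_field(["$450,000", "1,200,000", "$99,500"]))
```

Columns that come back as `None` are the ones you would still have to map by hand.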
Pretty much, something has to be hard to be worth doing because if it was easy, many other people would be doing it already or soon after you did it.
Very true. Though hard may just mean being in the right place at the right time. The Million Dollar Home Page was not technically hard, but it was hard to think of, and once it was done, it became impossible to make the idea new again. So really only the first guy could do it.
Still, there may be an easier way than scraping to get the data. You could fill out 3,143 Freedom of Information requests asking for a complete copy of their database as it is.
They may have to give me the information, but they can also charge. San Mateo County wanted $50 for a CD of fairly recent transfer tax data, from which you can figure out sales. Multiply by 3,143 counties.
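Spelling out that multiplication, under the (surely wrong) assumption that every county charges the same $50 that San Mateo quoted:

```python
COUNTIES = 3_143
FEE_PER_COUNTY = 50  # dollars; assumes every county charges San Mateo's quoted fee

total = COUNTIES * FEE_PER_COUNTY
print(f"${total:,}")  # $157,150
```

That is for a single snapshot; keeping the data current would mean paying again for each refresh.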
It would definitely be a competitive advantage to have all that data though. Some corporation must be doing this. I think some place named Ameridata used to have it all.
Ok Pat here goes. Movie reviews. I suggest another forum or category where people can put a movie up and review it. It's California, guy, and it's fascinating to us outsiders. Nothing fancy. Something like that. Just a category for movies. Here, I'll start it: what do you think of ............... Titanic?
I'm afraid that each new forum cannibalizes attention from the other forums. I already worry that I have too many and it's distracting from my "don't overpay for a house" message. Also, how do you make money from a movie review forum? I guess by advertising movies! So I bet that idea would work standalone, but it's kind of off topic for a real estate blog.
In fact, I realize it's a good enough idea that there is already a successful site that does it well: http://www.rottentomatoes.com/
Patrick, the estimated rents are just way too low when I tried to add a house. Why can't I edit the estimated rent myself? What is that based on? How come square footage is not even considered??
I was simply trying to enter the estimated rent based on the rates I'm paying myself... the values suggested by your tool were 30% and 50% below actual market rates for the two properties I tried to enter...
You can change them in the calculator and recalculate, but that doesn't change the default estimated rent I put in when I first scrape the listing from Craigslist.
That default estimated rent is just the median of the 40 closest rents which have the same number of bedrooms. Doesn't take anything else into account.
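As a rough illustration of that rule (not the site's actual code), here is a Python sketch. It assumes each rental is a dict with hypothetical keys `lat`, `lon`, `bedrooms`, and `rent`, and it measures "closest" with a flat distance on latitude/longitude, which is only a crude approximation:

```python
from math import hypot
from statistics import median

def estimate_rent(listing, rentals, k=40):
    """Median rent of the k nearest rentals with the same bedroom count,
    or None if there are no comparable rentals."""
    same = [r for r in rentals if r["bedrooms"] == listing["bedrooms"]]
    if not same:
        return None
    nearest = sorted(
        same,
        key=lambda r: hypot(r["lat"] - listing["lat"], r["lon"] - listing["lon"]),
    )[:k]
    return median(r["rent"] for r in nearest)

# Made-up sample data; k is lowered to 3 so the tiny sample is meaningful.
rentals = [
    {"lat": 0.0, "lon": 0.01, "bedrooms": 2, "rent": 2000},
    {"lat": 0.0, "lon": 0.02, "bedrooms": 2, "rent": 2400},
    {"lat": 0.0, "lon": 0.03, "bedrooms": 2, "rent": 2200},
    {"lat": 0.0, "lon": 0.50, "bedrooms": 2, "rent": 9000},
    {"lat": 0.0, "lon": 0.001, "bedrooms": 3, "rent": 5000},
]
listing = {"lat": 0.0, "lon": 0.0, "bedrooms": 2}
print(estimate_rent(listing, rentals, k=3))  # 2200
```

Since nothing but bedroom count and distance goes in, a large house and a small apartment with the same bedroom count get the same estimate, which fits the complaint above about square footage.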
Should I allow users to enter new listings on http://patrick.net/housing/forsale.php?
That would also give me some way of getting square footage, which I don't have any good way to get right now.
Multiply by 3,143 counties.
Perhaps the Department of Housing and Urban Development has a copy of property records. I'm not sure what kind of information they have, but it might be worth checking out.
The Foia.gov site is a portal to all Freedom of Information Act requests directed at the federal government. It has a page where you can make requests for records held by any federal agency including HUD.
OK, I'll make a movie review forum and we'll see how it goes.
Done. It's right above the Miscellaneous forum. Please write a review!
If the search function allowed fields (e.g., thread title, hashtag, comment, user) and filters (e.g., date) that would help. Currently, it seems to combine title and OP text and hashtag, and then offers to switch to a comment search, both of which yield haystacks of results.
Also, if the "new post" function checked for "new" posts that include an already posted URL, or keyword (e.g. Hyperloop), that might help too. Tovbot tends to spew new posts without proofreading, e.g. two within six minutes today about the same article, and has posted more than a dozen different threads about Hyperloop, for example.
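A duplicate-URL check like that could be sketched roughly as follows. The normalization rules (ignore scheme, "www.", query string, and trailing slash) and the `existing_posts` dict of post id to text are assumptions for illustration:

```python
import re
from urllib.parse import urlsplit

URL_RE = re.compile(r"https?://\S+")

def normalize_url(url):
    """Reduce a URL to host + path so trivial variants compare equal."""
    parts = urlsplit(url.rstrip(".,)"))
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    return host + parts.path.rstrip("/")

def find_duplicates(new_text, existing_posts):
    """Return ids of existing posts that share at least one URL with new_text."""
    new_urls = {normalize_url(u) for u in URL_RE.findall(new_text)}
    return [
        post_id
        for post_id, text in existing_posts.items()
        if new_urls & {normalize_url(u) for u in URL_RE.findall(text)}
    ]

# Made-up sample posts.
existing = {
    101: "Hyperloop news http://www.example.com/hyperloop/",
    102: "Something else https://example.org/a",
}
print(find_duplicates("See https://example.com/hyperloop.", existing))  # [101]
```

The new-post form could then show the matching threads and ask the poster to comment there instead.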
@Ironman thanks, that does seem to be a real bug. I'll work on it now.
ok @Ironman it should be fixed now.
the problem was that i decided to use the exact same tabs on home page, search, and user pages.
when i did that, i changed the home page to use a single sql statement with an $order_by for sorting by tab. but that single sql statement looked only at posts created in the last 7 days (post_date). the fix was to use post_modified instead of post_date for the special case of the "active" tab on the home page:
// for home page "active" tab alone, order by post_modified so that recently modified old posts pop to top of home page
// for other tabs, just look at posts created in last 7 days (with post_date)
$timecol = $order == 'active' ? 'post_modified' : 'post_date';
$posts_sql = "select SQL_CALC_FOUND_ROWS * from posts where $timecol > date_sub(now(), interval 7 day) and post_approved=1 $order_by limit $slimit";
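For anyone who wants to try the same idea outside PHP, here is a small self-contained sketch using Python's sqlite3. It is not the site's code: the table is a cut-down stand-in, the `ORDER BY` on the time column is an assumption (the original's `$order_by` is not shown), and the column name is picked from a fixed whitelist so nothing user-supplied is interpolated into the SQL.

```python
import sqlite3

# Whitelist: only these column names can ever reach the SQL text.
TIME_COLUMNS = {"active": "post_modified"}

def recent_posts(conn, tab, days=7, limit=20):
    """Ids of approved posts from the last `days` days, newest first.
    The "active" tab filters on post_modified; every other tab on post_date."""
    timecol = TIME_COLUMNS.get(tab, "post_date")
    sql = (
        f"SELECT post_id FROM posts "
        f"WHERE {timecol} > datetime('now', ?) AND post_approved = 1 "
        f"ORDER BY {timecol} DESC LIMIT ?"
    )
    return [row[0] for row in conn.execute(sql, (f"-{days} days", limit))]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE posts (post_id INTEGER, post_date TEXT, "
    "post_modified TEXT, post_approved INTEGER)"
)
# Post 1 is old but was modified yesterday; post 2 is two days old.
conn.execute("INSERT INTO posts VALUES (1, datetime('now','-30 days'), datetime('now','-1 days'), 1)")
conn.execute("INSERT INTO posts VALUES (2, datetime('now','-2 days'), datetime('now','-2 days'), 1)")

print(recent_posts(conn, "active"))  # [1, 2]
print(recent_posts(conn, "new"))     # [2]
```

The old post shows up only on the "active" tab, which is the behavior the fix was after.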
and then offers to switch to a comment search, both of which yield haystacks of results.
The number one rule of searches is to always AND terms, don't OR them. ORing is worthless. If a search for "Yankees" gives you too many hits to scroll through including civil war results, then the search for "New York Yankees" is even more worthless if the website ORs the terms. The entire purpose of adding more search words is to narrow down results, not expand them.
If you feel you must support ORing despite all reasoning, then do so explicitly with the OR keyword.
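Here is a toy sketch of that rule in Python: terms are ANDed by default, and OR only happens when the user types the literal keyword `OR`. It matches whole lowercase words and ignores punctuation and ranking, so treat it as an illustration rather than a real search engine.

```python
def matches(query, text):
    """True if text satisfies the query. Terms are ANDed; the keyword OR
    separates alternative groups, any one of which may match."""
    words = set(text.lower().split())
    groups = [[]]
    for token in query.split():
        if token == "OR":
            groups.append([])
        else:
            groups[-1].append(token.lower())
    return any(bool(g) and all(t in words for t in g) for g in groups)

def search(query, posts):
    return [p for p in posts if matches(query, p)]

# Made-up sample posts.
posts = [
    "New York Yankees win again",
    "Yankees and Confederates in the Civil War",
    "Boston Red Sox lose",
]
print(search("Yankees", posts))           # first two posts
print(search("New York Yankees", posts))  # only the first post
print(search("Yankees OR Sox", posts))    # all three posts
```

Adding words narrows the results, and widening them takes a deliberate `OR`.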
Quite frankly, I never use the PatNet search feature because of the ORing. I just use Google with the site option. For example, immense hirsute lesbian site:patrick.net. Google does a better job indexing your website than you will. And it's trivial to leverage Google inside your website.
Some ideas, suggestions
To Get Users You used to send out emails once a week of curated stories. That probably drove decent traffic to the site. Maybe start that again
Monetization: having anchor advertisers would help sell the site as well as google adsense (which sounds like is gone as an option)
Small Design Change- keep current layout but perhaps have a bar that lists say the top 5-8 general topics, US Politics, real estate etc that opens and shows only those posts, let people subscribe to topics (see above)
Add share buttons
Add real estate widgets- home valuation etc.
Write another book! I bought the first one
@anonymous
No I don't think it's resolved, but I can't test because I don't have a Windows computer to work on, only Mac and Linux.
If you know any Javascript, you could just "view source" and also view the browser's Javascript console to try to figure it out.
Sadly, Turtledove was helping me by telling me about some error in the Javascript console on her browser on Windows, but now I can't ask her about it anymore.
Some ideas, suggestions
To Get Users You used to send out emails once a week of curated stories. That probably drove decent traffic to the site. Maybe start that again
Actually reading and selecting stories (which I did for years) turned out to be so much unpaid work that I gave up. I could send out an email of the top-liked or top-commented links pretty easily though.
Monetization: having anchor advertisers would help sell the site as well as google adsense (which sounds like is gone as an option)
How would I get anchor advertisers? Maybe I should just sell Patrick.net swag instead: cups, hats, shirts. Even if sold cheap, it's good advertising.
Adsense revenue kept declining, so I gave up on it too.
Small Design Change- keep current layout but perhaps have a bar that lists say the top 5-8 general topics, US Politics, real estate etc that opens and shows only those posts, let people subscribe to topics (see above)
Yes, I do want to customize the site so the people can get tabs for topics they are interested in, and subscribe by topic. Just haven't done it yet.
Add share buttons
I have my own share button, but no one uses it. Literally no one! Not sure why. Maybe because it asks for an email address to share with.
Add real estate widgets- home valuation etc.
I did have my own valuation calculator, but others do it better, esp NY Times rent vs buy. Anyway, I want to really be a discussion forum more than anything else.
Write another book! I bought the first one
Need inspiration! Thinking of something like "Ten Politically Incorrect Truths".
Need inspiration!
The Medical Trap
Thinking of something like "Ten Politically Incorrect Truths".
That seems, perhaps paradoxically, too many and too few.
Please just keep in mind that while truth (meaning factual accuracy) is a complete defense to defamation, caricatures tend not to be truth. Technically correct may be the best kind of correct, but cherry-picking and confirmation bias can make a badly tendentious book.
please put links to the next/previous/etc page in a thread at both the top and bottom of the page.
@anonymous
Sadly, Turtledove was helping me by telling me about some error in the Javascript console on her browser on Windows, but now I can't ask her about it anymore.
Consider using one of the js error collection services. When js errors occur, they get automatically sent to a service for logging, where you can then go and view them. I can't recommend any as I've always written my own due to corporate intranet policy, but I never build web apps without js error collection anymore - there's always errors you're unaware of, and collection lets you find them. Users will rarely ever tell you about your bugs, especially if they don't happen often/repeatably, so being proactive yields much higher quality webapps.
Here's a list of what I'm talking about:
https://www.slant.co/topics/2615/~javascript-client-side-error-logging-services
They can automatically collect many errors using the browser "error" event handler, so it's really easy & unobtrusive to install and use. But you get better results if you can wrap your code in special exception watching functions.
Oh the Don has changed his hair colour
Dan8267 says
Trolls do well not to piss off admins. This must be a new species of troll: suiciders.
I'm on Windoze and I don't have any problem with the "quote" function. I'm also using Firefox. Is it a problem only with Internet Explorer?
Thanks! Maybe it is only IE. That would be typical.
If you get a chance to check quoting on IE or Chrome, I'd be grateful.
Consider using one of the js error collection services. When js errors occur, they get automatically sent to a service for logging, where you can then go and view them.
Thanks @c1561490 this is a great idea. Then I won't have to ask people about errors in their particular browser.
Personal anecdote... when there is a post at the top of the page, that has the UNREAD icon signaling me to click it to pick up the conversation where i left off, and one of the new posts is by a user that has me on ignore, it sends me to the OP. Which is kind of annoying. You've toyed around a bit with the ignore feature, not sure where the bright idea came from for the user ignoring someone posts are not visible.
not sure where the...idea came from for the user ignoring someone posts are not visible.
IIRC, that originated with Typhoid Marcus, who (mis)uses Ignore as part of a dysfunctional game of tag, even using another browser to check the comments of Users (s)he pretends to Ignore. As one would expect from that troll, combining "Ignore" with "Hide from" was a silly idea producing nothing but dysfunction and annoyance, which was the goal.
please put links to the next/previous/etc page in a thread at both the top and bottom of the page.
@c1561490 your wish has been granted.
Links to other pages of comments are now at both the top and bottom of a thread (aka a post):
"« First « Previous Comments 15-54 of 54 Last »"
Personal anecdote... when there is a post at the top of the page, that has the UNREAD icon signaling me to click it to pick up the conversation where i left off, and one of the new posts is by a user that has me on ignore, it sends me to the OP. Which is kind of annoying.
@errc OK, now that should be fixed. Thanks for pointing it out.
Here are the actual code changes, in case you're into that sort of thing:
https://github.com/killelea/patrick.net/commit/2a13bdda46f344d96ad29e422c668ddd9a5fcba9
@patrick nice, thanks
What is the reasoning behind having a person who's on ignore, being blocked from seeing the posts of the poster that ignored them?
Seems all #Safespaceish
What is the reasoning behind having a person who's on ignore, being blocked from seeing the posts of the poster that ignored them?
Ah, there is a reason!
I talked to a friend who worked at Facebook about this, and concluded that people are not comfortable with one-sided ignore. The problem is that if you ignore someone, but they can still see what you wrote, then they can respond to your writing with mockery and insults, and you will not easily be able to reply to them. Or even know that they are mocking you in public.
So it seemed that the best thing to do was to make ignore mutual. If you ignore someone, you simply disappear from their radar and they disappear from yours. If you really can't get along, that means one side or the other is not trying to get along, and so the best thing is just to chill out and not talk for a while.
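The difference between one-sided and mutual ignore fits in a few lines. This is only a sketch of the rule as described, not the site's code; it assumes ignores are stored as (ignorer, ignored) pairs, and the user names are made up:

```python
def can_see(viewer, author, ignores, mutual=True):
    """Whether `viewer` is shown comments written by `author`.
    `ignores` is a set of (ignorer, ignored) pairs."""
    if (viewer, author) in ignores:
        return False  # the viewer is ignoring the author
    if mutual and (author, viewer) in ignores:
        return False  # the author is ignoring the viewer (mutual mode only)
    return True

# Made-up example: alice ignores bob.
ignores = {("alice", "bob")}
print(can_see("alice", "bob", ignores))                # False
print(can_see("bob", "alice", ignores))                # False under mutual ignore
print(can_see("bob", "alice", ignores, mutual=False))  # True under one-sided ignore
print(can_see("carol", "alice", ignores))              # True, third parties unaffected
```

Either way, an ignore only ever changes what the two people involved see of each other.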
Meh, you blow up any semblance of Free Speech forum by conveying that kind of power to any one special snowflake. Personally, I'd fade anything I gleaned from anyone even loosely associated with Facebook.
And just because some pussy is afraid of what I have to say, doesn't automatically mean that I couldn't benefit from something that they have to say. Besides, with your approach, it's quite obvious that anyone so infantile as to engage in that crap here, can work up a two second workaround. And you're left with a less than working resolution.
please put links to the next/previous/etc page in a thread at both the top and bottom of the page.
@c1561490 your wish has been granted.
Links to other pages of comments are now at both the top and bottom of a thread (aka a post):
"« First « Previous Comments 15-54 of 54 Last »"
Thank you!
Meh, you blow up any semblance of Free Speech forum by conveying that kind of power to any one special snowflake. Personally, I'd fade anything I gleaned from anyone even loosely associated with Facebook.
And just because some pussy is afraid of what I have to say, doesn't automatically mean that I couldn't benefit from something that they have to say. Besides, with your approach, it's quite obvious that anyone so infantile as to engage in that crap here, can work up a two second workaround. And you're left with a less than working resolution.
No special snowflake has any more power than any other user. Well, I myself can edit whatever, but unlike the Reddit CEO I don't do that.
Ignores are only person-to-person. The important thing is that no one user A has the power to block user B from talking to everyone else.
Yes, you can still view the comments of someone who has you on ignore by logging out, and you could reply by creating a new identity with a new email, but that is inconvenient. And that's the point. I'm just trying to slow down bad interactions, like the lead bricks they use in nuclear power plants to prevent the thing from blowing up.
This is not completely true if the ignored cannot see or post to the ignorer's thread.
Maybe ignore should just apply to individual comments and not thread viewing/commenting?
Ignores are only person-to-person. The important thing is that no one user A has the power to block user B from talking to everyone else.
@patrick
Doesn't seem to work at all. A thread pops to the top of the front page, I click either the icon that marks a new comment has been made, or the hyperlink to the last comment, and I'm directed to the Original Post. Tsk tsk, Patrick. Your "solution" is stifling Free Speech.
I'm forced to censor what I post, or risk offending some feeble minded sissy, because you empower other users to alter my experience.
If you want to enable the Anti-American Free Speech haters to have a Safe Space, you're going to lose more users. If you must assist those too weak to simply scroll past the comments of someone they do not like, then keep the ignore feature simple: if User A ignores User B, then User B comments are hidden from User A view. The End
No stupid suggestions from some Facebook loser about empowering the anti-free speech crowd, to adversely affect those of us who value Free Speech.
Or, have all comments posted anonymously. No user names, No icons. Just words and thoughts. That way, there is no chance of those from the shallow end of the gene pool, cluttering the forum with personal attacks.
Isn't that kinda what reddit does?
Thought there was value in seeing debate between iwog/others vs. a bit more of a one sided discussion these days.
Totally get it- if you don't like what you're reading go elsewhere. And I'm aware of Patrick's offer to stand up other message boards for people.
My macro point (reason for the post) was that a DM feature might be nice to have on patnet, as if there were I would have just messaged a few people who have been here a while to ask: a.) Do you know if iwog still posts anywhere, would be interested in reading a blog/etc. Or b.) Where else do you go for debate / conversation.
He had an issue with posts being moderated and bailed on the site
patrick.net
An Antidote to Corporate Media