charlieegan3 220 days ago [-]
This is a great idea for a project, the 'users who also posted' metric seems to have worked really well.

The site seems to fail to load the 'hot' items for the subreddits when I click on them but that's not a big deal for me. On closer inspection, it doesn't seem to be making any requests. Just says `Failed to download https://www.reddit.com/r/thinkpad/hot.json` etc

aasasd 218 days ago [-]
> the 'users who also posted' metric

— Hello, is this the anime channel?

— Yes.

— How do I patch KDE2 under FreeBSD?

(https://en.wikipedia.org/wiki/How_does_one_patch_KDE2_under_...)

Smithalicious 218 days ago [-]
The accuracy of this meme is stunning. I run an anime-related discord of 20-odd people and at least half of the people there work in tech in some way. I've seen similar things in other such communities.

I wonder if this is just a cultural artifact from the time that anime and technology were both "geeky" niche interests (to a greater extent than they are now) or if there's a deeper underlying reason...

vidarh 217 days ago [-]
It may be a stereotype, but to me it seems that in geek circles it is much more acceptable to admit to continuing to appreciate things often seen as "childish" elsewhere in general.
mrfusion 217 days ago [-]
Another metric could be looking at cross posts. I’m not sure which is better.
Breza 209 days ago [-]
That could be cool, but it would eliminate any subs that don't allow crossposting. That includes a few of the heavy hitters like ShowerThoughts and AskReddit.
anvaka 218 days ago [-]
hmmm... I don't see the error on my end. What browser do you use? Can you try in "incognito" mode? Are there any extensions that might be blocking this?
Sendotsh 218 days ago [-]
Doesn't work for me either, on Firefox Developer Edition 65.0b10 (64-bit) with no extensions enabled (disabled them all to double-check it wasn't one of them blocking it).

Works fine in Edge.

It's purely the loading of the Hot sidebar, everything else works fine. It has already helped me find a few new subs I didn't know existed, so thanks!

newman314 217 days ago [-]
Have you thought about building something similar for bot identification on Twitter? I suspect that would be quite the useful feature.
charlieegan3 218 days ago [-]
Yeah I'm on FF - can confirm my issue was related to the content blocking.
jsloss 218 days ago [-]
I'm getting the same error on Firefox Quantum.
anvaka 218 days ago [-]
Hm... I'm at a loss.

https://jsbin.com/fuyijan/2/edit?js,console - this works in Chrome, and in non-private mode of Firefox Quantum (64.0.2 (64-bit)). However, when I open private browsing in Firefox Quantum, the request fails.

Does anyone know why?

EvilTerran 218 days ago [-]
That sounds like Content Blocking kicking in - that's only active in Private Browsing by default: https://support.mozilla.org/en-US/kb/content-blocking

I note that page says "By default, content blocking uses the Disconnect.me basic protection list" - and reddit.com is on that list: https://github.com/disconnectme/disconnect-tracking-protecti...

(I'm guessing reddit's "social button" is considered a tracker.)

[edit] confirmed, it's definitely Content Blocking: I just loaded that jsbin in an FF private window, and there's a message in the console to that effect.

anvaka 218 days ago [-]
Thank you so much! I opened an issue here: https://github.com/disconnectme/disconnect-tracking-protecti...
renholder 218 days ago [-]
>(I'm guessing reddit's "social button" is considered a tracker.)

It wouldn't surprise me. Even though /r/ has concepts like "Silver" and "Gold" to generate revenue, I think its main driver is still advertising; so, for it to behave like Facebook, Google, etc. wouldn't be that much of a stretch of the imagination. (Or maybe I'm just far too paranoid?)

Smithalicious 218 days ago [-]
It's a cool tool, but it seems very biased towards bigger subs. If you let it loose on a small sub it will emphasize big, kinda-but-not-really related subs over tiny-but-closely-related subs.
ppod 218 days ago [-]
Using Jaccard has this effect, mutual information would correct more for the independent frequency of the posts per subreddit.
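
A rough sketch of that difference, using pointwise mutual information (PMI) as a stand-in for the mutual-information idea. The user sets, sizes, and sub names below are made up for illustration and are not the site's data:

```python
import math

def jaccard(a: set, b: set) -> float:
    """Shared users divided by all users seen in either sub."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def pmi(a: set, b: set, total_users: int) -> float:
    """log(P(a, b) / (P(a) * P(b))); -inf when the subs share nobody."""
    both = len(a & b)
    if both == 0:
        return float("-inf")
    return math.log((both * total_users) / (len(a) * len(b)))

# Hypothetical toy data: 1000 users overall.
total = 1000
small = set(range(0, 20))        # the 20-user sub being queried
tiny = {0, 1, 100, 101, 102}     # 5-user sub, 2 of them also in `small`
medium = set(range(5, 105))      # 100-user sub, 15 of them also in `small`

for name, other in [("tiny", tiny), ("medium", medium)]:
    print(name, round(jaccard(small, other), 3), round(pmi(small, other, total), 3))
```

With these numbers Jaccard ranks the larger sub first, while PMI ranks the tiny one first because it divides out each sub's overall size. PMI tends to overcorrect towards very rare items, so in practice it is often combined with a minimum-overlap cutoff.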
Smithalicious 217 days ago [-]
It's a shame since this tool would be particularly useful for recommending small subs. I don't need it to tell me about big subs, since I already know them.
patcon 217 days ago [-]
This is seriously amazing man! Interesting to see how different subject-areas network themselves differently.

For example, comparing "r/permaculture" to "r/linux".

Also, looking at r/girlgamers makes me realize my privilege for being able to navigate my interest areas without such a clusterfuck of bullshit going on: https://anvaka.github.io/sayit/?query=girlgamers

swampthinker 217 days ago [-]
It's really sad how toxic Reddit brigading is
skilled 218 days ago [-]
This is awesome! My input had exactly the results I expected.

Thanks for creating this tool, bookmarking!

anvaka 218 days ago [-]
Thank you! I'm very glad you liked it :)
viraptor 218 days ago [-]
I checked VXjunkies and found a level of weirdness I hadn't expected. Will need a few hours to browse through this while nobody is around / can be startled by sudden, random laughter...
nairboon 220 days ago [-]
That's a cool tool. A useful extension would be for it to preserve the location history as you navigate topics, so that you can go back.
anvaka 218 days ago [-]
Good call. I was worried that I'd "spam" the browser history and people who are coming from reddit or HN would never go back to where they came from :)
adrianmonk 217 days ago [-]
Usability improvement idea: make it easier to discover how to re-center the graph around a new subreddit.

I spent several minutes playing around with this, and I was just typing in the name of the desired subreddit because that was the only way I could figure out. Finally, after much experimenting, I realized double-clicking is the solution.

Oh, and a second, related usability idea: if I double-click, don't open the preview sidebar at the right. I can see how the sidebar is useful, but if I'm doing one action, I don't want it to have two effects. Also, I have signaled clear intent to browse the graph, so I want more screen real estate to be devoted to that.

EDIT: bonus usability idea/request: clicking on a node brings up the preview sidebar. It'd be nice if clicking on it again (not double-clicking) makes the sidebar hide again.

KasianFranks 218 days ago [-]
Anvaka, when you accept BTC or ETH let us know, we can contribute to your efforts.
anvaka 218 days ago [-]
Thank you, Kasian!
hueyjj 218 days ago [-]
> The relationship is determined by a metric "users who posted to this subreddit also post to...".

I'm interested, could you share with us the entire metric you used to determine the relationship?

anvaka 218 days ago [-]
jcims 217 days ago [-]
Have you tried polling profiles to see how many are sharing upvotes/downvotes? It used to be a small percentage but is pretty informative.
minimaxir 218 days ago [-]
You indicated that you used the Pushshift.io datasets, but how did you compute Jaccard Similarity on a dataset of 38M?
anvaka 218 days ago [-]
I didn't use pushshift, sorry. The data was collected from bigquery, stored locally into CSV files, and then I just wrote a node.js script to compute similarities.
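
For readers curious what such a script might look like, here is a minimal sketch in Python rather than Node.js. It is not the actual script; the `author` and `subreddit` column names and the sample rows are assumptions made up for illustration:

```python
import csv
import io
from collections import defaultdict
from itertools import combinations

def load_users_by_sub(csv_file):
    """Map each subreddit to the set of users seen posting in it."""
    users_by_sub = defaultdict(set)
    for row in csv.DictReader(csv_file):
        users_by_sub[row["subreddit"]].add(row["author"])
    return users_by_sub

def jaccard_pairs(users_by_sub):
    """Return {(sub_a, sub_b): jaccard} for every pair sharing at least one user."""
    # Invert the mapping so only subs that actually share a user get paired.
    subs_by_user = defaultdict(set)
    for sub, users in users_by_sub.items():
        for user in users:
            subs_by_user[user].add(sub)

    shared = defaultdict(int)
    for subs in subs_by_user.values():
        for a, b in combinations(sorted(subs), 2):
            shared[(a, b)] += 1

    scores = {}
    for (a, b), both in shared.items():
        union = len(users_by_sub[a]) + len(users_by_sub[b]) - both
        scores[(a, b)] = both / union
    return scores

# Hypothetical sample standing in for the exported CSV files.
sample = io.StringIO(
    "author,subreddit\n"
    "alice,thinkpad\n"
    "alice,linux\n"
    "bob,thinkpad\n"
    "bob,linux\n"
    "carol,linux\n"
    "dave,permaculture\n"
)
scores = jaccard_pairs(load_users_by_sub(sample))
print(scores)
```

Counting overlaps per user avoids comparing every pair of subreddits directly, though a user active in very many subs still produces a quadratic number of pairs.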
Scaevolus 218 days ago [-]
Did you simply collect "user has posted to X, Y, and Z subreddits", or did you look at frequency too?
minimaxir 217 days ago [-]
The reason I asked the question is because back in 2016 I had a similar (now out of date) approach to finding related subreddits at scale using Jaccard similarity: https://minimaxir.com/2016/06/reddit-related-subreddits/

There, I only built a user edge if a given user commented on 5 distinct threads in a subreddit, since a lot of subreddit interaction was due to brigading.
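
A minimal sketch of that kind of threshold, assuming comments arrive as `(author, subreddit, thread_id)` tuples and reading "5 distinct threads" as "at least 5". The function name and data are hypothetical and not taken from the linked post:

```python
from collections import defaultdict

MIN_THREADS = 5  # assumed reading of the threshold: at least 5 distinct threads

def active_users_by_sub(comments, min_threads=MIN_THREADS):
    """Keep a (user, subreddit) edge only if the user commented in enough distinct threads.

    comments: iterable of (author, subreddit, thread_id) tuples.
    """
    threads = defaultdict(set)
    for author, subreddit, thread_id in comments:
        threads[(author, subreddit)].add(thread_id)

    users_by_sub = defaultdict(set)
    for (author, subreddit), thread_ids in threads.items():
        if len(thread_ids) >= min_threads:
            users_by_sub[subreddit].add(author)
    return dict(users_by_sub)

# Hypothetical data: alice is a regular in linux, bob only piled into one thread there.
comments = [("alice", "linux", f"t{i}") for i in range(5)]
comments += [("bob", "linux", "t0")] * 7
comments += [("bob", "news", f"n{i}") for i in range(6)]
result = active_users_by_sub(comments)
print(result)
```

Counting distinct threads rather than raw comments means a single heated thread does not create an edge on its own.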

anvaka 218 days ago [-]
I didn't look into frequency. Is there a version of jaccard similarity that accounts for frequencies?
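
For reference, one generalization sometimes used for this is the weighted Jaccard (Ruzicka) similarity: the sum of per-user minimum counts divided by the sum of per-user maximum counts. A minimal sketch with hypothetical post counts:

```python
def weighted_jaccard(counts_a: dict, counts_b: dict) -> float:
    """Sum of per-user min counts over sum of per-user max counts."""
    users = counts_a.keys() | counts_b.keys()
    num = sum(min(counts_a.get(u, 0), counts_b.get(u, 0)) for u in users)
    den = sum(max(counts_a.get(u, 0), counts_b.get(u, 0)) for u in users)
    return num / den if den else 0.0

# Hypothetical posts-per-user counts for two subs.
sub_a = {"alice": 10, "bob": 1}
sub_b = {"alice": 8, "carol": 3}
print(weighted_jaccard(sub_a, sub_b))
```

When every count is 0 or 1 this reduces to the ordinary set-based Jaccard similarity.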
yorwba 217 days ago [-]
scrollbar 218 days ago [-]
Check out Graphlab Create's recommender toolkit, pretty fast for sets of that size

https://turi.com/products/create/docs/graphlab.toolkits.reco...

Smithalicious 218 days ago [-]
+1 for this recommendation, but it's called turicreate now and can be found here: https://github.com/apple/turicreate
jotato 218 days ago [-]
*types in DunderMifflin

related: MapsWithoutSouthSudan

I know what I am going to be doing for the next 30 minutes

bibyte 218 days ago [-]
This is a really useful tool. It works so smoothly on my mobile.
anvaka 218 days ago [-]
Happy to hear :)!
Phenomenit 218 days ago [-]
Great,

I've been searching for a tool like this for ages, bookmarked!

anvaka 218 days ago [-]
Thanks :)
laurynas-s 218 days ago [-]
This is really nice!
anvaka 218 days ago [-]
Thanks :)!
techaddict009 218 days ago [-]
Good tool. If possible, add an option to view the result data in tabular format with the number of subscribers. As it is, it's difficult to use.
DevX101 218 days ago [-]
Great tool! This site supports my suspicions that much of the activity on /r/The_Donald is the coordinated effort of a few individuals posting across multiple accounts. For those not familiar with this sub, it was created sometime during the 2016 election leadup and unabashedly supports Donald Trump with memes and shitposting. At one point, the entire frontpage of reddit was just posts from /r/The_Donald until reddit admins had to alter their algorithm to force the sub off.

If you look at the network graph for /r/The_Donald, it doesn't look...organic. There are 4 clearly delineated clusters of subs related to that sub. Posters to /r/The_Donald heavily post to /r/news & /r/politics, /r/TropicalWeather (?), /r/TwoXChromosomes (?) and /r/AskTheDonald (and other alt-right subs).

There's not much interaction with the rest of reddit. Posters from other subs don't also post content to /r/The_Donald.

This is unusual.

For every other sub I've looked at, there's a much more complex & dynamic graph where users post across various communities across the site. Every other major sub looks like a real network with dozens of interconnected links. Yet /r/The_Donald, with almost 700,000 subscribers, only has strong connections to 4 clusters.

The alternate hypothesis is that people on that sub heavily use alternate accounts. This might also explain the lack of interaction with the site compared to other subs of similar size.

zawerf 218 days ago [-]
DevX101 218 days ago [-]
Thanks! That's probably it then. I guess this doesn't support my hypothesis after all.
bdibs 218 days ago [-]
This is great, and works flawlessly!
anvaka 218 days ago [-]
Thank you! I'm so happy you like it.
bdibs 218 days ago [-]
It’s simple and just works, don’t stop making great things.
anvaka 218 days ago [-]
Aww, thank you!

> don’t stop making great things.

Not going to ever stop! I have sooo many ideas - I wish I could be more efficient :).

cannedslime 217 days ago [-]
Useful little tool! Reddit humor subs are so damn specific, it can be hard to find them all.
mrfusion 217 days ago [-]
Is this only for tech subjects or am I using it wrong?

Edit. Somehow I missed the big searchbar at the top.

cambaceres 217 days ago [-]
I tried "tits", that worked.
criddell 217 days ago [-]
Ornithologist?
cambaceres 217 days ago [-]
There were some cocks present anyway
jamiek88 217 days ago [-]
Fantastic! I tested the heck out of this and found it really useful.

Already found some cool subs.

belltaco 217 days ago [-]
You should submit this to r/dataisbeautiful if not already done.
ppod 218 days ago [-]
Which javascript network vis library does this use? It's very nice.
yanslookup 218 days ago [-]
I was sort of expecting to be able to click through to the subreddit...
benibraz 218 days ago [-]
Very nice tool, thank you very much for that. This is why I love HN
kerbalspacepro 217 days ago [-]
Interesting finds:

* /r/askscience is nested at the center of defaults (I think a lot of older, famous subs will end up highly connected)

* /r/relationship_advice is kind of a loner. The graph generates six distinct subreddit clusters- feminism, lgbt-issues, counseling, and misc. science fields. The last cluster is a very large, diffuse cluster of sex/porn/depression subreddits that skew towards defaults.

* /r/slatestarcodex has distinct clusters too. 1) Effective altruism and philosophy, 2) Psychiatry, 3) Rational fiction writing, 4) Liberal-tarian, IDW defaults, 5) "Classic effort post" subs like true_reddit and depth_hub.

* /r/bigboye is a tiny part of a very large network of animal gifs subreddits. /r/animalsbeingbros connects it to a bunch of high volume gif subs.

newman314 217 days ago [-]
* /r/the_donald has a surprising link to /r/TwoXChromosomes [https://anvaka.github.io/sayit/?query=the_donald]

* /r/politics seems to have higher interconnection

* /r/awww is quite wholesome =) [https://anvaka.github.io/sayit/?query=Awww]

* /r/puppers has some strange nsfw links

belltaco 217 days ago [-]
>/r/the_donald has a surprising link to /r/TwoXChromosomes

I don't think it's surprising. Donald fans on social media tend to hate minorities and women, not surprised they would try to brigade women oriented subs.

It got so bad that subs like /r/offmychest automatically ban people that post in many alt right related subreddits.

patcon 217 days ago [-]
The isolatedness of /r/relationship_advice might have to do with OPs posting from throwaways?
chad_strategic 220 days ago [-]
This is great!

But on a side note, I can also waste more time on the Internets!

diziet 218 days ago [-]
Is this built on top of your work on yasiv before?
anvaka 218 days ago [-]
It would be fair to say so. The core layout is the same with a bit more polished overlap removal and animation.
myself248 217 days ago [-]
Why do I get stuck in "dead ends"? For instance, https://anvaka.github.io/sayit/?query=rtlsdr contains https://anvaka.github.io/sayit/?query=PlutoSDR but the inverse is not true -- once I'm in PlutoSDR there's only one other subreddit and the two of them are an island.
andyidsinga 217 days ago [-]
damn - I'm wondering if this could help with marketing, in order to find out where your audience hangs out.
sureaboutthis 218 days ago [-]
Ya' know this assumes one would use reddit as a reference for learning which one should never, EVER do, don't ya?
thro_a_way 218 days ago [-]
hi thanks for this. Is there a guide to how you are storing the data on github pages?
amunategui 218 days ago [-]
Great visualization! Nice work.
chx 218 days ago [-]
Incredibly useful, thanks!
diimdeep 217 days ago [-]
Amazing!
flylib 218 days ago [-]
nice tool
yzb 218 days ago [-]
Would be nice if banned subs appeared in a different colour.
patcon 217 days ago [-]
If you have spacetime, you might consider sharing this with LGBTQ and kink communities experiencing the Tumblr diaspora.

Context: https://nowtoronto.com/lifestyle/advice/savage-love-tumblr-p...

Lots of people feel uprooted from sex-positive and/or tightly-bound communities they've been part of for years, and don't know how to rediscover or rebuild the healthy networks they've lost on Tumblr. I know full-grown adult women who are struggling to find footing again in the most personal of spaces.