TE
TechEcho
Home24h TopNewestBestAskShowJobs
GitHubTwitter
Home

TechEcho

A tech news platform built with Next.js, providing global tech news and discussions.

GitHubTwitter

Home

HomeNewestBestAskShowJobs

Resources

HackerNews APIOriginal HackerNewsNext.js

© 2025 TechEcho. All rights reserved.

Ask HN: What research questions do you want answered?

1 pointsby qwertygnuover 3 years ago
Feel free to be as impractical, inane, and&#x2F;or idealistic as you wish. What have you been curious about and thought, &quot;I wonder what would happen if you ran a study on this...&quot;<p>Here&#x27;s a couple examples:<p>- What numbers are people least&#x2F;most likely to notice missing in an ordered list of numbers? E.g. If you ask people to quickly scan the list [60,61,62,63,64,65,66,67,68,69], which element when taken out would have the highest&#x2F;lowest % of people who respond that there are no missing elements? Is it numbers that end in 7? Maybe people always notice when numbers with repeated digits are missed (66).<p>-

1 comment

dalkeover 3 years ago
The Jaccard index between two bit-vectors A and B is<p><pre><code> ||A intersection B|| &#x2F; ||A union B|| = popcount(A &amp; B) &#x2F; popcount(A | B) </code></pre> Let N be the number of bits in the bit-vector.<p>Consider A_1, A_2, A_3, ... A_i which are distinct bit-vectors in the 2^N-1 possibilities. (Clearly the all-zero bit vector is not a solution since it will have an index of 0.0 with all other bit-vectors.)<p>For each N, what is the largest i such that all pair-wise Jaccard index values are distinct? And what is an example?<p>This comes up in testing algorithms in my field of cheminformatics. The bit-vectors are &quot;molecular fingerprints&quot;, and my field refers to the Jaccard index as &quot;the Tanimoto score&quot;.<p>If there are ties in the scores then different implementations of the same algorithm break ties arbitrarily.<p>For cross testing, I want a test set which has no ties. All implementations should give the same results.<p>This is not urgent. We were able to come up with a random sampling method which found 27 bit vectors for N=64 bits. (I also tried a GA and a Z3 search, but those weren&#x27;t as effective.)<p><pre><code> 010ca5212bb73967 id1 222257801a012513 id2 0005683000080038 id3 efffff7fffffffff id4 fbfbdfffedfbdeff id5 21602a5034b064a0 id6 7d01cc7d85503009 id7 e7ff2efbfa723f4d id8 dfe9bcffddffefff id9 11eaf6836d1a1b13 id10 356c55eff7decef9 id11 b6bfdfda73f7faff id12 0200000000002122 id13 dfdd7fdf7f7dfeff id14 12449480c4c7cb1c id15 67efdeff75fbbe4a id16 d0ae48211cc31910 id17 77e23fb4dfd848b1 id18 53c4fffbe5fbf4fb id19 b51f9cbffede9f10 id20 0100200100000281 id21 f7f60d895cc39d5f id22 f997c59d6c75067a id23 eeffffffffffdf7f id24 0003d010581e0813 id25 914d85e6e9f9566b id26 9d6f3f77b74a779c id27 </code></pre> Still, I would like to know the maximum value for i.<p>I brute-forced it in C and found values up to N=8 or so. Starting from N=2: max i = 2, 3, 4, 5, 5, 6, 7. An example solution for N=8 is:<p><pre><code> 01 id1 0e id2 17 id3 37 id4 77 id5 7f id6 ff id7</code></pre>