30 points

What the heck is a real-life number?

21 points

An actual measured data point, as opposed to a randomly generated number. Also this principle applies specifically to the first digit. Overall the title is a complete mess.

Basically, when you gather a bunch of data points about real-world quantitative phenomena (e.g. town populations, lake surface areas, etc.), you find this distribution curve of leading digits where 1 is the most frequent at something like 30%, gradually decreasing down to 9 being the least frequent.

This is called Benford’s Law; it’s basically an emergent property of how orders of magnitude work. It’s useful because you can use it to detect fake data: if your data faker doesn’t know about it, they’ll generate fake data that looks random but doesn’t follow this distribution.
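To make that concrete, here is a rough Python sketch. The expected share of leading digit d under Benford's Law is log10(1 + 1/d). Everything else here is an assumption for illustration: powers of 2 stand in for "real" data spanning many orders of magnitude, a flat range of 4-digit numbers stands in for naive fake data, and the function names are made up.

```python
import math
from collections import Counter

def benford_expected(digit: int) -> float:
    """Expected share of leading digit `digit` (1-9) under Benford's Law."""
    return math.log10(1 + 1 / digit)

def leading_digit(n: int) -> int:
    """First decimal digit of a positive integer."""
    return int(str(n)[0])

def leading_digit_shares(values) -> dict:
    """Observed share of each leading digit 1-9 in `values`."""
    counts = Counter(leading_digit(v) for v in values)
    total = sum(counts.values())
    return {d: counts.get(d, 0) / total for d in range(1, 10)}

# Assumption: powers of 2 as a stand-in for data spanning many magnitudes.
powers = [2 ** k for k in range(1, 1001)]
# Assumption: a naive faker who treats every 4-digit number as equally likely.
naive = range(1000, 10000)

observed = leading_digit_shares(powers)
fake = leading_digit_shares(naive)
for d in range(1, 10):
    print(f"{d}: Benford {benford_expected(d):.3f}  "
          f"powers of 2 {observed[d]:.3f}  naive {fake[d]:.3f}")
```

The powers of 2 track the Benford column closely, while the naive data is flat across all nine digits. That flatness is the kind of mismatch a fraud check would look for.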

3 points

something that isn’t an imaginary life number

30 points

Great video on Benford’s Law here. Matt goes into a good amount of detail outlining why this occurs, why it doesn’t always apply, and what it means if data does/doesn’t follow the Law.

1 point

Neat. Thanks for the share.

9 points

Here is an alternative Piped link: https://piped.video/etx0k1nLn78

Piped is a privacy-respecting open-source alternative frontend to YouTube.

I’m open-source, check me out at GitHub.

5 points

Does anybody know if this is a feature of a decimal system?

3 points

The distribution shown in this post is for base 10, but Benford’s Law includes distributions for other bases too. The wiki article linked in another comment goes into detail on that.

2 points

The percentages change. At the lower end, in binary every number that isn’t 0 itself starts with a 1.

This fact is actually used to save one bit in the format that computers usually use to store floating point (fractional instead of integer) numbers.
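A small Python sketch of that trick, assuming Python floats are IEEE 754 binary64 (true on essentially every platform). It splits a double into its stored bit fields and then rebuilds the value by putting the unstored leading 1 back. The helper names are made up, and it only handles normal numbers (not zero, subnormals, infinities or NaN).

```python
import struct

def double_fields(x: float):
    """Split a binary64 float into (sign, exponent, fraction) bit fields."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    sign = bits >> 63
    exponent = (bits >> 52) & 0x7FF          # 11 exponent bits, biased by 1023
    fraction = bits & ((1 << 52) - 1)        # 52 stored significand bits
    return sign, exponent, fraction

def rebuild_normal(sign: int, exponent: int, fraction: int) -> float:
    """Rebuild a normal double, restoring the leading 1 that is not stored."""
    significand = (1 << 52) | fraction       # implicit leading 1 goes here
    value = significand * 2.0 ** (exponent - 1023 - 52)
    return -value if sign else value

sign, exponent, fraction = double_fields(6.5)
print(sign, exponent, f"{fraction:052b}")
print(rebuild_normal(sign, exponent, fraction))
```

So 52 bits are stored, but the significand effectively has 53 bits of precision, because the first one is always 1 for normal numbers.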

2 points

If you were in Base 12 or something it would still lean towards 1 but the percentage would be a little different.
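For a rough idea of how much it changes: the general form of the law for base b gives leading digit d an expected share of log_b(1 + 1/d). A quick sketch, with a made-up function name:

```python
import math

def benford_share(digit: int, base: int = 10) -> float:
    """Expected share of leading digit `digit` (1..base-1) in `base`."""
    if not 1 <= digit < base:
        raise ValueError("digit must satisfy 1 <= digit < base")
    return math.log(1 + 1 / digit, base)

for base in (2, 10, 12):
    print(f"base {base}: leading 1 expected {benford_share(1, base):.1%}")
```

In base 12 a leading 1 comes out at roughly 28% instead of about 30%, and in base 2 it is 100%, which matches the binary comment above.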

13 points

I think it’s a feature of all positional notation systems.

10 points

This is a bit weird. I was just listening to Infinity 2 today (great book. Totally recommend), and there’s a section where the characters use Benford’s Law to prove reality. I then had to look it up myself.

Just a super weird coincidence…unless Lemmy is listening to me…

8 points

We are not listening to you Travis.

That had a 1 in a million chance, but I had to try.

5 points

It was worth the shot if you ask me, Michael

6 points

This is called the Baader–Meinhof phenomenon, or frequency illusion.

5 points

So if I rolled a 10-sided die 1000 times, 30% of those rolls would be a 1?

22 points

No

24 points

Thanks. Now I understand

7 points

From what I understand it works like this.

Let’s say you have a series of numbers that represent real-life data. In general, the first digit of these numbers will be a 1 about 30% of the time.

1 point

No, it is a property of real-life things. It comes from the fact that most things in the real world don’t go over 30 or 300 that often, like the number of houses on a street.

6 points

It works on things that operate on a logarithmic scale. It’s odd how many real-world things fit that mold that don’t intuitively seem like they would.

Another factor promoting it in real-world data sets is that they often have restricted ranges that favor lower numbers. Days of the month, for example, only go from 1 to 31. There’s only one way for the leading digit to be 4, but there are eleven ways for the leading digit to be 1.

Another type of data includes values of varying ranges, which also favors lower leading numbers. Street numbers start at 1 and go up, ending at some point within a fairly large range in the real world. All of these ranges will have their fair share of leading 1s. They will NOT all have a fair share of leading 2s (what if it ended before 20?), and as you go up it gets progressively less likely. So if you took all street addresses, you’d expect to see more leading 1s than 9s.
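Both effects are easy to count. A quick sketch, with one loud assumption: the "streets" are a toy model with exactly one street of each length from 1 to 300 houses, numbered 1 upward, not real address data.

```python
from collections import Counter

def leading_digit_counts(numbers) -> Counter:
    """Count how often each leading decimal digit occurs in `numbers`."""
    return Counter(int(str(n)[0]) for n in numbers)

# Restricted range: days of the month run from 1 to 31.
days = leading_digit_counts(range(1, 32))
print("days:", dict(sorted(days.items())))

# Varying ranges (toy assumption): one street of each length 1..300.
streets = Counter()
for last_house in range(1, 301):
    streets.update(leading_digit_counts(range(1, last_house + 1)))
print("streets:", dict(sorted(streets.items())))
```

For the days, a leading 1 shows up eleven ways and a leading 4 only one way. In the toy street model the counts fall steadily from 1 down to 9, since every street has low house numbers but only the long ones reach the high ones.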

Your theoretical dice roll is not such a case. You would expect a uniform distribution of leading numbers. This would hold true with a 99-sided die as well.

3 points

While that’s true for a 99-sided die, with a 10-sided die (assuming it’s numbered 1 to 10) 20% of your rolls will start with a one and every other digit only has a 10% chance.
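You can check this without rolling anything by just counting faces. A sketch, assuming a fair die numbered 1 up to the number of sides (some real d10s are numbered 0 to 9, which would change the picture); the function name is made up.

```python
from collections import Counter
from fractions import Fraction

def leading_digit_probabilities(sides: int) -> dict:
    """Exact chance of each leading digit for a fair die numbered 1..sides."""
    counts = Counter(int(str(face)[0]) for face in range(1, sides + 1))
    return {d: Fraction(counts.get(d, 0), sides) for d in range(1, 10)}

for sides in (9, 10, 99):
    probs = leading_digit_probabilities(sides)
    print(sides, {d: str(p) for d, p in probs.items()})
```

With 9 or 99 sides every leading digit gets exactly 1/9. With 10 sides the faces 1 and 10 both start with a 1, so it gets 1/5 and the rest get 1/10 each. Neither case looks like the Benford curve.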

2 points

Oh, yes. Thanks for the correction!


Today I Learned

!til@lemmy.world
