I was working on some data and had to assign large numbers to some values in a DataFrame. When I read the values back, they had surprisingly changed. I know for a fact that this is not a display or printing problem; something else is going on. Here is an example:

x = 410121209151013.6360
print("%.5f" % x)

And this is what I get:

410121209151013.62500

I ran some tests and found that there is some sort of digit limitation, but I don't know how to fix it. Any help is much appreciated.

  • Does this answer your question? [Is floating point math broken?](https://stackoverflow.com/questions/588004/is-floating-point-math-broken) – ti7 May 20 '21 at 15:22
  • [What Every Computer Scientist Should Know About Floating-Point Arithmetic](//docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html) – Pranav Hosangadi May 20 '21 at 15:23
  • Use [Decimal](https://docs.python.org/3/library/decimal.html) if you want accurate results. You have to pass the number as a string; otherwise you lose precision before it is passed to the Decimal object. – Nearoo May 20 '21 at 15:28
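
The `Decimal` suggestion in the last comment can be sketched roughly as follows. This is only an illustration using the value from the question; the variable names are made up for the example. It shows why the string form matters: a float literal has already been rounded to the nearest binary value before `Decimal` ever sees it.

```python
from decimal import Decimal

# Built from a string: every digit is kept exactly as typed.
exact = Decimal("410121209151013.6360")

# Built from a float: Decimal receives the already-rounded binary value.
from_float = Decimal(410121209151013.6360)

print(exact)       # 410121209151013.6360
print(from_float)  # 410121209151013.625
```

Note that a `Decimal` stored in a DataFrame lives in an `object` column, so this trades speed for exactness.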

1 Answer

Floating-point math is fraught with peril related to how the numbers are stored (see the comments on the question).
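
To make the storage limit concrete: a float64 carries a 53-bit significand, so the gap between neighbouring representable values grows with the magnitude of the number. A minimal sketch for the value in the question, assuming Python 3.9 or later for `math.ulp`:

```python
import math

x = 410121209151013.6360

# Spacing between adjacent float64 values at this magnitude.
gap = math.ulp(x)
print(gap)  # 0.0625

# .6360 is not a multiple of 0.0625, so the literal is stored as the nearest one.
print(x == 410121209151013.625)  # True
```

With steps of 0.0625 there is no way to hold `.6360` at this magnitude, which is why `.62500` comes back.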

Whenever you can work in an integer space, try to do so, and then represent the numbers as you see fit (for example, you could multiply them by 1000000 and work in micro-whatevers instead of whatevers).
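
As a rough sketch of the integer approach, assuming four decimal places are enough and the values are non-negative (`SCALE`, `to_scaled_int`, and `to_text` are hypothetical names for this example, not part of any library):

```python
from decimal import Decimal

SCALE = 10_000  # assumption: four decimal places are enough

def to_scaled_int(text):
    """Parse a decimal string into an exact integer count of 1/SCALE units."""
    return int(Decimal(text) * SCALE)

def to_text(scaled):
    """Format a non-negative scaled integer back into a decimal string."""
    whole, frac = divmod(scaled, SCALE)
    return f"{whole}.{frac:04d}"

n = to_scaled_int("410121209151013.6360")
print(n)           # 4101212091510136360
print(to_text(n))  # 410121209151013.6360
```

This stays exact only while the value remains an integer; converting it to float64 rounds it again.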

ti7
  • Thank you for your fast reply! I already tried it with integers and still have the same problem: 4101212091510136360 --> 4101212091510136320, and 41012120915101363601 --> 41012120915101360128 – Jihad Tannoury May 20 '21 at 15:32
  • Anytime! Alas, the problem happens as soon as the system reads the numbers in, because of how they are actually stored (you could print them in binary to see what is really saved). Take some comfort in that you probably don't need so much accuracy, even if it appears to be available! – ti7 May 20 '21 at 15:38
  • Well, I'm using a custom function with rolling apply on a DataFrame, which only lets me return one value. I need more than one, so I combined all the info I need into one encoded number; that's why I need all this accuracy. I can't return int because I have some NaNs in the data, so I use float64. I could fix my problem by running the rolling apply 5 times and returning one parameter each time, but my code would take 5 times longer. I guess I'll have to find a workaround. – Jihad Tannoury May 20 '21 at 16:12
  • Brutal! Just filter out the NaNs with [`~isna()`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.isna.html). Simplifying this problem may make a good question of its own, if it's something you can share! – ti7 May 20 '21 at 16:37
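
The integer results reported in the comments follow from the same limit: float64 represents every integer exactly only up to 2**53 (about 9.007e15). If several small values must travel through a single float64, one possible sketch is to keep the combined code below that bound. The helpers `pack` and `unpack` and the choice of `base=1000` are assumptions made for illustration, not something taken from the thread:

```python
MAX_EXACT = 2**53  # float64 holds every integer up to this magnitude exactly

# The value from the comment needs more than 53 bits, so it gets rounded.
big = 4101212091510136360
print(int(float(big)))  # 4101212091510136320

def pack(fields, base=1000):
    """Combine small non-negative integers into one float64-safe code."""
    code = 0
    for value in fields:
        if not 0 <= value < base:
            raise ValueError(f"{value} does not fit in base {base}")
        code = code * base + value
    if code > MAX_EXACT:
        raise OverflowError("combined code would lose precision as float64")
    return float(code)

def unpack(code, count, base=1000):
    """Recover `count` fields from a code produced by pack()."""
    code = int(code)
    fields = []
    for _ in range(count):
        code, value = divmod(code, base)
        fields.append(value)
    return fields[::-1]

packed = pack([12, 345, 678, 9, 42])
print(unpack(packed, 5))  # [12, 345, 678, 9, 42]
```

Five fields below 1000 give a code under 10**15, which is comfortably within the exact range; larger fields or more of them would need a different encoding.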