From this problem sheet (warning: links to an image):
Consider a function which, for a given whole number n, returns the number of ones required when writing out all numbers between 0 and n. For example, f(13) = 6. Notice that f(1) = 1. What is the next largest n such that f(n) = n?
I thought that this was an interesting problem. I get 199,981 for the next such n, followed by 1,599,990.
I implemented the problem in (ugly) C and (sweet) OCaml, and was pleasantly surprised to find that the ML version was faster than C. Who has a nicer/faster C implementation than mine, or an implementation in your favourite functional language that beats out my ML implementation?
#include <stdio.h>
#include <stdlib.h>
#include <math.h>

#define MAX_DIGITS 10

int main(){
    unsigned n = 2;
    while(1){ // For each n
        unsigned ones_up_to_n = 0;
        unsigned curr;
        // For current n, curr gets each int up to n.
        // Then we find the number of 1s in curr, and add them onto
        // `ones_up_to_n'.
        for(curr = 0; curr <= n; curr++){
            // Make array with digits of number
            // i.e. 123 => [0,0,0,0,0,0,0,1,2,3]
            unsigned str[MAX_DIGITS] = {0};
            long denom = (long)pow(10, MAX_DIGITS - 1);
            unsigned temp_n = curr;
            unsigned i;
            for(i = 0; i < MAX_DIGITS; i++){
                unsigned div = temp_n / denom;
                temp_n -= (div * denom);
                str[i] = div;
                denom /= 10;
            }
            // Count 1s
            for(i = 0; i < MAX_DIGITS; i++)
                if(str[i] == 1) ones_up_to_n++;
        }
        if(n == ones_up_to_n){
            printf("\rFound n: %u\n", n);
            exit(0);
        }
        n += 1;
    }
    return 0;
}
open Printf

let ones_in n =
  let str = string_of_int n in
  let len = String.length str in
  let rec ones_in str idx total =
    if idx = len
    then total
    else begin
      match str.[idx] with
      | '1' -> ones_in str (idx + 1) (total + 1)
      | _   -> ones_in str (idx + 1) total
    end in
  ones_in str 0 0

let ones_up_to n =
  let rec ones_up_to n total =
    match n with
    | 0 -> total
    | m -> ones_up_to (m - 1) (total + (ones_in m)) in
  ones_up_to n 0

let _ =
  let n = ref 2 in
  while true do
    let ones = ones_up_to !n in
    if ones = !n
    then (printf "\r%15d\n%!" !n; exit 0)
    else n := (!n + 1)
  done
What I found interesting is how f(x) varies with x. f(x) stays lower than x (as you might expect) until it equalises at 199,981. However, the slope of f(x) / x becomes much steeper when x is a number beginning with 1, as you can see from the below graph of f(x) over x...the green line is y = x, and the red line is y = f(x).
Note the sudden increase in steepness at x=10,000, and also at x=100,000. In addition, if the first digit of x is not a 1, but the second digit is, then you can see little 'jumps' in the steepness of the graph.
Last edited by karkhaz (2014-08-08 21:34:21)
Offline
I thought that this was an interesting problem. I get 199,981 for the next such n, followed by 1,599,990.
Hrmm, but 199,982 uses one more 1, and thus is a candidate. And, all the numbers between 199,982 and 200,001 are candidates too. This seems to be a common occurrence for larger values of N, too.
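A quick brute-force check (a sketch of my own, agreeing with the full output posted later in the thread) of which n near 200,000 actually satisfy f(n) = n; the block turns out to be 199,981-199,990 plus 200,000 and 200,001, since 199,991-199,999 each contain a 1 themselves:

```python
# Brute-force the candidate "block" around n = 200,000.
# f(n) is the running total of 1-digits used writing 1..n.
def ones_in(n):
    count = 0
    while n:
        if n % 10 == 1:
            count += 1
        n //= 10
    return count

total = 0
solutions = []
for n in range(1, 200002):
    total += ones_in(n)
    if total == n:
        solutions.append(n)

print(solutions)
# 1, then 199981..199990, then 200000 and 200001
```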
Here's my C contribution:
#include <stdio.h>

int f(int n) {
  int ones = 0;

  while (n) {
    if (n % 10 == 1)
      ++ones;
    n /= 10;
  }

  return ones;
}

int main(void) {
  int n = 0, total = 0;

  for (;;) {
    ++n;
    total += f(n);
    if (n == total)
      printf("fn(%d) = %d\n", n, total);
  }

  return 0;
}
Compiled at -O3, this finds 117463825 as a candidate in 3.7s (3.3 GHz CPU)
Last edited by falconindy (2014-08-08 22:38:26)
Offline
Hrmm, but 199,982 uses one more 1, and thus is a candidate. And, all the numbers between 199,982 and 200,001 are candidates too. This seems to be a common occurrence for larger values of N, too.
You're right. What I meant was that, given that these `blocks' of candidates occur, the next block starts at 1,599,981.
Here's my C contribution:
*...slow clap* Way way faster than my solution, thanks! And simpler too.
Offline
I'm not at my main machine, but there is a faster way to calculate f(n) for large n rather than summing it up as you go. Consider the pattern
f(9) = 1
f(99) = 10 + 10
f(999) = 100 + 100 + 100
...
which can be proven to extrapolate to any number of nines, and consider how to calculate separately the number of ones that are in the ones' place, those that are in the tens' place, those that are in the hundreds' place, and so on, without having to count them individually. I would post my proof of concept code but it's just way too ugly and I'm dealing with a weird keyboard layout (qwerty) so I'm not willing to try to fix it now.
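The pattern above extrapolates to the closed form f(10^k - 1) = k * 10^(k-1): each of the k digit positions cycles through 0-9 uniformly, so each contributes 10^(k-1) ones. A minimal brute-force check in Python (my own sketch, not Trent's proof-of-concept code):

```python
# Closed form suggested by the pattern above: writing out 0..(10^k - 1)
# uses k * 10^(k-1) ones in total.
def f_closed(k):
    return k * 10 ** (k - 1)

# Brute-force reference: count 1-digits over 0..n inclusive.
def f_brute(n):
    return sum(str(i).count('1') for i in range(n + 1))

for k in range(1, 5):
    assert f_closed(k) == f_brute(10 ** k - 1)
print([f_closed(k) for k in range(1, 5)])  # [1, 20, 300, 4000]
```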
Granted, if you have to walk over all the positive integers anyway it's probably cheaper to use an accumulator like falconindy's solution does.
Offline
Granted, if you have to walk over all the positive integers anyway it's probably cheaper to use an accumulator like falconindy's solution does.
Indeed, but your observation is useful if you want to calculate f(n) for a particular n without finding all the previous ones first. Here's my code based on your observation:
#include <math.h>

long ones_in(long n){
    long ones = 0;
    while(n){
        if(n % 10 == 1) ++ones;
        n /= 10;
    }
    return ones;
}

long f(long n){
    unsigned digits = 0;
    long temp_n = n;
    while(temp_n){
        digits++;
        temp_n /= 10;
    }
    // The largest 100...000 smaller than n
    long pow_ten = (long)pow(10, digits - 1);
    // f(the largest 999...999 smaller than n)
    long ones = (digits - 1) * (pow_ten / 10);
    // Then calculate the remainder of the ones from pow_ten upwards
    long i;
    for(i = pow_ten; i <= n; i++){
        ones += ones_in(i);
    }
    return ones;
}
This code calculates f(1000000000000000000) instantly. I can't go much higher than that number, the program spins for some reason (limits on the size of long?).
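The spin is probably not overflow at 10^18 itself but the loop: f walks from pow_ten up to n one number at a time, so 10^18 exactly takes one iteration while an arbitrary 19-digit n can take up to ~9 * 10^18 iterations, which merely looks infinite. A 20-digit n additionally overflows a 64-bit signed long (LONG_MAX = 2^63 - 1 ≈ 9.22 * 10^18). A quick check of both boundaries, assuming 64-bit longs:

```python
# Why f(10^18) is instant but slightly larger inputs "spin": the loop
# runs (n - pow_ten + 1) times, where pow_ten is the largest power of
# ten <= n.
LONG_MAX = 2 ** 63 - 1           # typical 64-bit signed long
print(10 ** 18 <= LONG_MAX)      # True: 10^18 still fits
print(10 ** 19 <= LONG_MAX)      # False: a 20-digit n overflows

n = 10 ** 18
pow_ten = 10 ** (len(str(n)) - 1)
print(n - pow_ten + 1)           # 1 iteration for n = 10^18 exactly
```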
Edit: Also, here's a new ML implementation based on falconindy's idea. It's about as fast as falconindy's C code on my machine.
open Printf

let ones_in n =
  let rec ones_in n total =
    match n with
    | 0 -> total
    | _ -> begin match n mod 10 with
           | 1 -> ones_in (n / 10) (total + 1)
           | _ -> ones_in (n / 10) total
           end in
  ones_in n 0

let _ =
  let n = ref 1 in
  let ones = ref 0 in
  while true do
    ones := !ones + (ones_in !n);
    if !ones = !n
    then printf "%12d\n%!" !n
    else ();
    n := (!n + 1)
  done
Last edited by karkhaz (2014-08-18 00:44:21)
Offline
Here's a Haskell version. It could be a little cleaner, but it's not bad. It runs in ~30 seconds up to 1111111110 (compiled with -O2 -fllvm). To put that in perspective, falconindy's posted version (32-bit ints) runs in ~22 seconds here. Using longs (64-bit ints) with his code, it runs in about ~26 seconds. When using arbitrary-length integers in my Haskell version -- which is the problem domain -- things come out to ~360 seconds*:
import Control.Monad

ones :: Int -> Int -> Int
ones n acc
  | n == 0 = acc
  | n /= 0 = if digit == 1
               then ones next (acc + 1)
               else ones next acc
  where
    (next, digit) = n `quotRem` 10

matches :: Int -> Int -> [Int]
matches n a
  | n == acc = n : matches (n + 1) acc
  | n /= acc = matches (n + 1) acc
  where
    acc = a + ones n 0

main :: IO ()
main = mapM_ (\x -> print x) $ matches 1 0
* Integer division for arbitrary length integers is a major bottleneck in Haskell (at least).
**EDIT**
I had to know whether, for arbitrary-length integers, it was simply faster to convert the numbers to strings and count the 1s. It is. This blob runs in ~315 seconds. Admittedly, it will fail on the first number with 9223372036854775808 1's in it, as 'length' only returns 64-bit ints.
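As a reference point, the same string-counting trick sketched in Python (my own illustration; arbitrary precision comes for free there):

```python
# Count 1-digits by rendering the number as a string, the same idea as
# the Haskell version that follows; works for ints of any size.
def ones_in(n):
    return str(n).count('1')

# f(13) = 6, the example from the original problem statement
print(sum(ones_in(i) for i in range(14)))  # 6
```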
import Control.Monad

ones :: Integer -> Integer -> [String]
ones n a
  | n == acc = enn : ones (n + 1) acc
  | n /= acc = ones (n + 1) acc
  where
    enn = show n
    acc = a + (fromIntegral $ length $ filter (== '1') $ enn)

main :: IO ()
main = mapM_ (\x -> putStrLn x) $ ones 1 0
Last edited by skottish (2014-08-16 22:25:50)
Offline
Integer division for arbitrary length integers is a major bottleneck in Haskell (at least).
So consider switching to Caml. Here's my arbitrary-length integer version, and it gets to 1111111110 in about 38 seconds. Your second Haskell program (using strings) took about 340s on my machine.
open Printf
open Big_int

let ten = big_int_of_int 10
let one = unit_big_int
let zero = zero_big_int

let ones_in n =
  let rec ones_in n total =
    if eq_big_int n zero
    then total
    else let (quot, rem) = quomod_big_int n ten in
         if eq_big_int rem one
         then ones_in quot (succ_big_int total)
         else ones_in quot total in
  ones_in n zero

let _ =
  let rec f_all n ones_so_far =
    let new_ones = add_big_int ones_so_far (ones_in n) in
    begin
      if eq_big_int new_ones n
      then printf "%d\n%!" (int_of_big_int n)
      else ();
      f_all (succ_big_int n) new_ones
    end in
  f_all zero zero
My rubbish first attempt in the first post used strings as well, but it was slower than your version in Haskell. I still find OCaml simpler to read than Haskell though, and I need to learn Haskell at some point soon, yuck.
Last edited by karkhaz (2014-08-17 17:58:05)
Offline
@karkhaz,
What compiler options are you using to compile your OCaml code? With just ocamlopt, your regular int version is running over four minutes on my end.
OCaml seems like a pretty cool language. I wouldn't mind messing around with it some more.
My Haskell is still pretty novice. I'm going to try to get this one more performant.
**EDIT**
Just for fun, this is a still slow, but totally functioning Clojure version:
(ns cl.core
  (:gen-class)
  (:require [clojure.java.io :as io :only (reader)]))

(defn ones
  ([n] (ones n 0))
  ([n acc]
   (let [digit (rem n 10)
         next-num (quot n 10)]
     (if (== n 0)
       acc
       (if (== 1 digit)
         (recur next-num (inc acc))
         (recur next-num acc))))))

(defn matches
  ([] (matches 1 0))
  ([n a]
   (let [acc (+ (ones n) a)]
     (if (== n acc)
       (println n))
     (recur (inc n) acc))))

(defn -main
  []
  (matches))
Last edited by skottish (2014-08-17 19:36:44)
Offline
Oh wow, eating humble pie at the moment. When I thought I was running my arbitrary-int Caml code, I was actually running your regular-int Haskell code. *facepalm* That's why it was so fast, I thought that couldn't be right.
Here's the scoreboard so far:
Time to reach 1111111110
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
falconindy C:                 00m 31s
skottish regular-int Haskell: 00m 36s
karkhaz regular-int OCaml:    03m 35s
skottish big-int Haskell:     04m 46s
karkhaz big-int OCaml:        43m 57s
I'm relatively new to ML. AFAIK, there are no optimisation flags that you can pass to ocamlopt. If you want to build my big-int version, you'll need to link in the nums library: ocamlopt -o ml_big_int nums.cmxa ml_big_int.ml.
I can't figure out how to run the Clojure code (never used Clojure before). I did
lein new cloj
cd cloj
Then copied your code into src/cloj/core.clj, and then started lein repl in the cloj directory and typed (-main), but it is "Unable to resolve symbol". The code looks cute though.
Based on trent's observation, I wonder if this problem could be fruitfully multi-threaded. Given that we can trivially calculate the answer for 10, 100, 1000,... then it might be worth spawning some threads with those numbers as their 'starting point' and then each thread carries on from there. I'll do an implementation at some point, but I bet it would be cute if done in Erlang or some other concurrency-oriented language.
Offline
I can't figure out how to run the Clojure code (never used Clojure before). I did
lein new cloj cd cloj
Then copied your code into src/cloj/core.clj, and then started lein repl in the cloj directory and typed (-main), but it is "Unable to resolve symbol". The code looks cute though.
On the top of my core.clj, change the namespace line to:
(ns cloj.core
and see if that helps. If everything runs, go into the project directory and use:
lein uberjar
The version that will run will be in ./target/uberjar and is the "standalone" one. Run it with "java -jar". If the uberjar sub-directory isn't there, then your version of leiningen is getting old. If that's the case, then the jars will be simply be under "target" (I think). I am running openjdk8 with Clojure 1.6.
Based on trent's observation, I wonder if this problem could be fruitfully multi-threaded. Given that we can trivially calculate the answer for 10, 100, 1000,... then it might be worth spawning some threads with those numbers as their 'starting point' and then each thread carries on from there. I'll do an implementation at some point, but I bet it would be cute if done in Erlang or some other concurrency-oriented language.
If I understand what's happening with the code that you wrote with this algorithm before, I have some observations. First off, I ported it to Haskell to see if I could use it. I quickly realized that I was going to end up doing more work with that version than with what we've been using. The reason is that even if an accumulator is implemented, I would still have to traverse nearly every single number (the only exceptions would be when the upper limit is an exact power of 10). Even though for a single number that is way faster, adding the extra math while still having to traverse nearly every number will slow it down. I may be wrong about this, but I couldn't see how to escape it.
Last edited by skottish (2014-08-17 21:36:15)
Offline
There is an algorithm that calculates f(n) directly (without calculating f(n-1) first, for all values of n) in O(log n) time. I spent a while inventing it last week, left it on a desk in another state and really don't feel like working it out again from scratch right now. But it certainly does exist, and could be used to multithread the original problem.
If somebody else cares to figure out what it was and post it before I get around to it, I won't complain :-)
Last edited by Trent (2014-08-17 23:44:36)
Offline
The reason is that even if an accumulator is implemented, I would still have to traverse nearly every single number (the only exceptions would be when the upper limit is an exact power of 10). Even though for a single number that is way faster, adding the extra math while still having to traverse nearly every number will slow it down. I may be wrong about this, but I couldn't see how to escape it.
You don't need to traverse nearly every single number...given n, you need to find the largest x = 100...000 that is smaller than n, and then traverse every number from x up to n. Trent's observation was that you can trivially calculate f(x) (well, Trent did it using x - 1), and so you don't have to traverse numbers up to x.
Here's a solution using multi-processing, with Trent's observation. It's a bit faster than falconindy's solution on my dual-core machine. It forks a new process for each order of magnitude (so a process to find candidates between 1-10, another process for candidates between 10-100, and so on).
#include <math.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <stdlib.h>
#include <sys/wait.h>

// Look for solutions with up to N_THREADS digits
#define N_THREADS 10

long ones_in(long n){
    long ones = 0;
    while(n){
        if(n % 10 == 1) ones++;
        n /= 10;
    }
    return ones;
}

int main(){
    pid_t pids[N_THREADS] = {0};
    for(int exp = 0; exp < N_THREADS; exp++){
        long start_n = pow(10, exp);
        long end_n = pow(10, exp + 1);
        pid_t pid = fork();
        if(pid != 0){ // top-level parent process
            pids[exp] = pid;
            continue;
        }
        // child worker process
        long ones = (start_n / 10) * exp; // f(999...999)
        for(long n = start_n; n < end_n; n++){
            ones += ones_in(n);
            if(ones == n) fprintf(stderr, "%ld\n", n);
        }
        exit(0);
    }
    int status;
    for(int i = 0; i < N_THREADS; i++)
        waitpid(pids[i], &status, 0);
}
Note that this will print the numbers out-of-order, so don't kill it when it prints out 1111111110. You need to wait for it to print out 535200001, which is an order of magnitude smaller but takes longer to calculate.
Time to reach 1111111110
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
karkhaz multiprocess C:       00m 24s
falconindy C:                 00m 31s
skottish regular-int Haskell: 00m 36s
karkhaz regular-int OCaml:    03m 35s
skottish big-int Haskell:     04m 46s
karkhaz big-int OCaml:        43m 57s
Trent, I had a half-formed algorithm in my mind which would work in O(log n) time (if it works at all); perhaps it might be similar to what you were thinking of. The general idea is that I start at the largest 100...000 lower than n, as before, and then 'zoom in' to n in large steps rather than one at a time (using binary partitioning).
Anyway, the code I wrote in post #5 (based on your idea) for working out a particular n is already super-fast even for huge n. We'd have to use arbitrary-size ints in Haskell or whatever to demonstrate the advantages of an O(log n) algorithm
Last edited by karkhaz (2014-08-18 00:41:32)
Offline
Here is more or less the algorithm I was thinking of. It's actually not that different in concept from the "zooming in" idea, but it works inside out: first it counts all the ones that would be in the ones' place when written out, then all the ones that would be in the tens' place, etc. whereas your proposed method would start with the highest order of magnitude and get closer by degrees. This algorithm works by partitioning the number at each step into three pieces like so:
1 0 1(4) # 101 "ones", plus 1 because 4 > 1
1 0(1)4 # 10*10 "ones", plus 4+1 because 1 = 1
1(0)1 4 # 1*100 "ones", plus nothing because 0 < 1
(1)0 1 4 # 0*1000 "ones", plus 14+1 because 1 = 1
At each step the digit in (parens) gets prepended to the remainder. The remainder is only used in calculation when the digit under examination is exactly 1.
The following is a linear recursive implementation in Python. It calculates f(2**3300) on my machine without breaking a sweat or detectable latency. Much larger than that and I run into recursion depth limits. Tail call optimization would fix that.
def f(n, place=1, rmdr=0, accum=0):
    if n == 0: return accum
    new_n, digit = divmod(n, 10)
    tmp = place*new_n
    if digit == 1:
        tmp += rmdr + 1
    elif digit > 1:
        tmp += place
    return f(new_n, place*10, digit*place + rmdr, accum + tmp)
Considering arbitrary precision arithmetic, I think this is actually O(log^2 n). It's still not going to beat a straightforward counting solution for enumerating all of them, but perhaps there's some extra cleverness that could be applied, like sampling f(n) every so often and singling out for closer examination intervals where (f(n) - n) changes sign.
Here's a puzzler for the mathematicians: Are there infinitely many n > 0 such that f(n) = n? I suspect there are, and furthermore I suspect there are infinitely many n > 0 such that f(n) != n, but I couldn't prove either assertion.
(edit) Hmm, on reflection, I'm not sure whether that makes sense or not. Still interesting to think about.
Last edited by Trent (2014-08-18 12:48:41)
Offline
It's still not going to beat a straightforward counting solution for enumerating all of them, but perhaps there's some extra cleverness that could be applied, like sampling f(n) every so often and singling out for closer examination intervals where (f(n) - n) changes sign.
The general idea would be to try galloping search to find a and b such that (f(a) - a) and (f(b) - b) have opposite sign, then bisect the interval repeatedly to find zeros.
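A Python sketch of that idea (my own illustration, not anyone's posted code; f is an iterative port of the digit-partition function from Trent's earlier Python post). One caveat: the sign of (f(n) - n) is only piecewise monotonic, so each gallop-plus-bisect bracket yields one crossing, not all of them; starting just past the 200,001 block it localizes the start of the next block:

```python
# Galloping search + bisection on the sign of g(n) = f(n) - n.
# f computes the cumulative 1-digit count in O(number of digits).
def f(n):
    place, rmdr, total = 1, 0, 0
    while n:
        n, digit = divmod(n, 10)
        tmp = place * n
        if digit == 1:
            tmp += rmdr + 1
        elif digit > 1:
            tmp += place
        total += tmp
        rmdr += digit * place
        place *= 10
    return total

def next_crossing(start):
    # Gallop with doubling steps until f(hi) - hi changes sign...
    lo, hi, step = start, start, 1
    while f(hi) < hi:
        lo, hi, step = hi, hi + step, step * 2
    # ...then bisect for the first n in (lo, hi] with f(n) >= n.
    while lo + 1 < hi:
        mid = (lo + hi) // 2
        if f(mid) < mid:
            lo = mid
        else:
            hi = mid
    return hi

print(next_crossing(200002))  # 1599981, start of the next block
```

The bisection is sound here because every number in the bracket's leading-1 stretch contains at least one 1, so (f(n) - n) is non-decreasing across the crossing.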
Here's a puzzler for the mathematicians: Are there infinitely many n > 0 such that f(n) = n? I suspect there are, and furthermore I suspect there are infinitely many n > 0 such that f(n) != n, but I couldn't prove either assertion.
If there are infinitely many n such that f(n) = n then there are also infinitely many such that f(n) != n because there are many intervals where f(n) grows faster than n (whenever n has multiple 1s in it).
Offline
I have been thinking about this off and on since I saw this thread yesterday. Here's my algorithm using slopes to jump ahead. My count_ones function computes the total number of "1"s up to a given number directly, without accumulating over every number below it.
The algorithm is fresh-baked and unrefined. I expect that the next slope at each jump could be chosen more efficiently. There are probably also some numerical tricks that can be used to speed up some calculations that I have done naively at the moment.
I have used unsigned ints for now. At some point I hit an integer overflow error which could be avoided using unsigned long ints. I have changed the algorithm since then so I'm not sure if I still hit the overflow or if the higher orders just take too much time. I haven't bothered debugging it yet.
Note that within an order of magnitude n (10^n), the rate of increase of "1"s ranges from n to 0 but averages to n/10, which means that for some x above n=10, the "1"s accumulate faster than the numbers increase. The total number of solutions is therefore finite.
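A quick numerical sanity check of the average-rate claim (my own sketch, not Xyne's code), using the four-digit numbers (order n = 3): the three non-leading positions each contribute 1/10 of a one per number, and the leading position a bit more, so the average lands near n/10:

```python
# Total 1-digits used writing the four-digit numbers 1000..9999:
# three cycling positions give 3 * 9000/10 = 2700 ones, and the
# leading position gives another 1000 (numbers 1000..1999).
total = sum(str(i).count('1') for i in range(1000, 10000))
avg = total / 9000
print(total)           # 3700
print(round(avg, 3))   # 0.411, roughly n/10 = 0.3 plus a leading 1/9
```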
Here's my code (v8)
#include <stdlib.h>
#include <stdio.h>

#define MIN(a, b) (((a) < (b)) ? (a) : (b))
#define MAX(a, b) (((a) > (b)) ? (a) : (b))
#define NEXT(x, y, k) ((unsigned long int) ((((k)*(x)) - (y)) / ((k) - 1)))

/*
  Count all "1"s required to write the sequence of numbers up to the given
  number.

  The number of "1"s in 10^n is n(10^n)+1 (+1 for the leading "1").
*/
unsigned long int
count_ones(
  unsigned long int n,
  unsigned int * order,
  unsigned int * digit
)
{
  unsigned long int ones = 0, remainder = 0, t = 0;
  *order = -1;
  while (n)
  {
    ++(*order);
    *digit = (n % 10);
    if (t)
    {
      ones += *digit * (*order * t);
      t *= 10;
    }
    else
    {
      t = 1;
    }
    if (*digit == 1)
    {
      ones += remainder + 1;
    }
    else if (*digit > 1)
    {
      ones += t;
    }
    remainder += *digit * t;
    n /= 10;
  }
  return ones;
}

int
main(int argc, char * * argv)
{
  double x, y, k;
  unsigned long int x0, y0, x1, y1 = 0, next_range, previous_range = 1;
  unsigned int order = 0, previous_order = 0, digit;
  int i;

  // Code for testing the count_ones function.
  if (argc > 1)
  {
    for (i = 1; i < argc; i++)
    {
      x0 = atoi(argv[i]);
      printf("%lu -> %lu\n", x0, count_ones(x0, &order, &digit));
    }
    return EXIT_SUCCESS;
  }

  /*
    Print out the first solution here and skip to the first order of magnitude
    to reduce the number of checks needed (k > 1).
  */
  printf("1\n");
  x0 = 10;

  /*
    After the order reaches 10, the number of "1"s grows too fast for the number
    to catch up.
  */
  while (order <= 10)
  {
    y0 = count_ones(x0, &order, &digit);

    /*
      Print the solution and check if the next number is also a solution.
    */
    if (x0 == y0)
    {
      printf("%lu\n", x0);
      x0++;
      continue;
    }
    /*
      Because the "1"s accumulate (i.e. the sequence is non-decreasing), no
      value of x0 below y0 at this point can have as many "1"s. Jump ahead to
      y0.
    */
    else if (y0 > x0)
    {
      x0 = y0;
      continue;
    }

    /*
      The next range is the next sequence of numbers with a leading "1". This
      range will increase the maximum slope by 1.
    */
    next_range = previous_range;
    // ">=" to reach the next order, i.e. "(order + 1) > previous_order"
    while (order >= previous_order)
    {
      next_range *= 10;
      previous_order++;
    }
    previous_range = next_range;

    x = (double) x0;
    y = (double) y0;

    /*
      The maximum slope is the order + 1 because this is the maximum number of
      "1"s that may appear in the number of that order. Here the slope is set to
      the order and the +1 is conditionally added below if the current number
      begins with "1".
    */
    k = (double) order;

    /*
      If we are currently in a range of numbers beginning with "1" and x is less
      than f(x) then f(x) is increasing at least as fast as x. Jump ahead to
      the next number with a leading "2".
    */
    if (digit == 1)
    {
      next_range /= 5;
      k += 1.0;
    }

    // Use the slope to find the next possible candidate in the current range.
    x1 = NEXT(x, y, k);
    // Jump to the next range if it begins before the candidate.
    x1 = MIN(x1, next_range);
    // Make sure that x0 always increases.
    x0++;
    x0 = MAX(x0, x1);
  }

  // Print out information about where the algorithm stopped.
  printf("\nStopped at %lu.\n", x0);
  y0 = count_ones(x0, &order, &digit);
  y1 = y0;
  if (y1 > x0)
  {
    // +1 to ensure that an order of 10 implies a difference >10^10.
    y1 -= (x0 + 1);
    order = 0;
    while (y1 /= 10)
    {
      order++;
    }
    if (order >= 10)
    {
      printf(
        "f(%lu) - %lu =\n"
        "%lu - %lu =\n"
        "%lu >\n"
        "10^10 => No more solutions are possible.\n",
        x0, x0,
        y0, x0,
        y0 - x0
      );
      return EXIT_SUCCESS;
    }
  }
  printf("More solutions may be possible (increase the maximum order to find them).\n");
  return EXIT_SUCCESS;
}
Compiled with
gcc -Wall -O3...
Results:
$ time ./a.out
1
199981
199982
199983
199984
199985
199986
199987
199988
199989
199990
200000
200001
1599981
1599982
1599983
1599984
1599985
1599986
1599987
1599988
1599989
1599990
2600000
2600001
13199998
35000000
35000001
35199981
35199982
35199983
35199984
35199985
35199986
35199987
35199988
35199989
35199990
35200000
35200001
117463825
500000000
500000001
500199981
500199982
500199983
500199984
500199985
500199986
500199987
500199988
500199989
500199990
500200000
500200001
501599981
501599982
501599983
501599984
501599985
501599986
501599987
501599988
501599989
501599990
502600000
502600001
513199998
535000000
535000001
535199981
535199982
535199983
535199984
535199985
535199986
535199987
535199988
535199989
535199990
535200000
535200001
1111111110
Stopped at 125521283796.
f(125521283796) - 125521283796 =
171546411754 - 125521283796 =
46025127958 >
10^10 => No more solutions are possible.
real 0m0.001s
user 0m0.000s
sys 0m0.000s
The time varies between 1 and 4 ms.
Btw, if you pass it numbers on the command line then it will print out the number of "1"s for each given number.
edit 1
I have tweaked the algorithm and updated the code and results. The code now uses unsigned long ints.
It fails to detect some numbers, so the algorithm obviously needs further tweaking.
edit 2
Previous errors were due to floating point rounding errors. Changing floats to doubles along with another tweak to the algorithm seems to have fixed it. Can someone confirm that it finds all solutions?
edit 3
v6: stopping data
edit 4
v7: simplified code, added comments, removed debugging statements
edit 5
v8: reduced iteration by adjusting k value based on first digit
Last edited by Xyne (2014-08-25 18:49:43)
My Arch Linux Stuff • Forum Etiquette • Community Ethos - Arch is not for everyone
Offline
Can someone confirm that it finds all solutions?
Your output matches up with mine.
Offline
Xyne wrote:Can someone confirm that it finds all solutions?
Your output matches up with mine.
Thanks!
Offline
Note that within an order of magnitude n (10^n), the rate of increase of "1"s ranges from n to n/10, which means that for some x above n=10, the "1"s accumulate faster than the numbers increase. The total number of solutions is therefore finite.
While this makes logical sense and agrees with everything I've found so far, it's not sufficient evidence because there are an infinite number of intervals over which the slope is 0. (For example, [23456789098765432, 23456789098765440] is such an interval.) Knowing the average rate of accumulation over an order of magnitude isn't enough to tell you that f(n) is never equal to n over that interval. But I guess the conclusion is correct regardless.
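The example interval is easy to verify mechanically; a throwaway check of my own:

```python
# Verify that no number in the example interval contains a 1-digit,
# i.e. f(n) has slope 0 across the whole interval.
lo, hi = 23456789098765432, 23456789098765440
no_ones = all('1' not in str(n) for n in range(lo, hi + 1))
print(no_ones)  # True
```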
Offline
Xyne wrote:Note that within an order of magnitude n (10^n), the rate of increase of "1"s ranges from n to n/10, which means that for some x above n=10, the "1"s accumulate faster than the numbers increase. The total number of solutions is therefore finite.
While this makes logical sense and agrees with everything I've found so far, it's not sufficient evidence because there are an infinite number of intervals over which the slope is 0. (For example, [23456789098765432, 23456789098765440] is such an interval.) Knowing the average rate of accumulation over an order of magnitude isn't enough to tell you that f(n) is never equal to n over that interval. But I guess the conclusion is correct regardless.
I disagree. The average rate over order n is n/10. The only thing missing is to show that at some point (f(x) - x) > 10^10. After that point, regardless of however many intervals of 0 slope there are, they are not long enough to close the difference of 10^10. For every subsequent interval of 10^10 numbers ([x, x+10^10]) at least 10^10 "1"s will accumulate. The numbers cannot catch up. v6 of my script includes stopping data which shows that this condition is met:
...
535199990
535200000
535200001
1111111110
Stopped at 108265749976.
f(108265749976) - 108265749976 =
126782229975 - 108265749976 =
18516479999 >
10^10 => No more solutions are possible.
Offline
I disagree. The average rate over order n is n/10. The only thing missing is to show that at some point (f(x) - x) > 10^10. After that point, regardless of however many intervals of 0 slope there are, they are not long enough to close the difference of 10^10. For every subsequent interval of 10^10 numbers ([x, x+10^10]) at least 10^10 "1"s will accumulate. The numbers cannot catch up.
Ok, I'm happy now
Offline
I am clearly not making the most of my time, but I have made some notes in case anyone is interested. It is not necessary to find f(x) for which f(x) - x > 10^10. Its existence is ensured by the average rate surpassing 1.
edit: cleaned up after the brainfart that led to "notes in anyone is interesting".
Last edited by Xyne (2014-08-19 17:05:36)
Offline
FINAL EDIT: please ignore this post. It turns out it's complete nonsense that is evidence of only two things: 1) my embarrassment, in that I generally consider myself quite mathematically minded, yet posted this; and 2) I should never ever try to tackle such problems before the morning coffee.
-----
Perhaps another way to conceptualize Xyne's solution is that in an infinite set of rational numbers each digit is represented uniformly at each place. In other words, the number '1' shows up as often in the ones place as '2' does and '3' does, etc. Specifically, each numeral shows up 1/10 of the time in each place. So for the set of all numbers greater than ten digits, there are more than ten places for each of these ten numerals to be distributed.
In fact, we should be able to conclude just from this, that among 10-digit numbers, the total number of 1s used to write those numbers should be exactly equal to the number of numbers (i.e. f(10^10) = 10^10). [edit: I deleted this once when I second guessed it, then reconfirmed it: 10^10 distinct numbers, each with 10 digits (leading zeros are counted) and each place holding a '1' one out of ten times, so 10^10 numbers * 10 digits * 1/10 ones-per-digit = 10^10]
The further beyond 10 digits one goes, the faster this divergence of f(x) > x will be as there are more places in which 1s will be uniformly distributed.
Or, perhaps this is exactly how Xyne was conceptualizing it to start with and I'm just catching up.
EDIT: now I'm second guessing again. I'm convinced by the logic of this approach - but I realized the actual number would be different. The largest 10-digit number is not 10^10 of course, but 10^11 - 1. So it is f(10^11-1) that should be equal to 10^10.
Last edited by Trilby (2014-08-19 13:11:53)
"UNIX is simple and coherent..." - Dennis Ritchie, "GNU's Not UNIX" - Richard Stallman
Offline
I thought falconindy's solution was a good starting point for me to experiment a little with Rust. The following is a near-straight port of falconindy's C:
/// `ones(n)` calculates and returns the number of ones in the decimal
/// representation of the natural number `n`
fn ones(n: int) -> int {
let mut tmp = n;
let mut result = 0;
while tmp > 0 {
if tmp%10 == 1 {
result += 1;
}
tmp /= 10;
}
return result;
}
fn main() {
let mut total = 0;
let mut n = 0;
while n < 1111111110 {
n += 1;
total += ones(n);
if n == total {
println!("f({}) = {}", n, total);
}
}
}
Rust's int is 64 bits wide. I changed the types in each version (and added the termination condition to the C version) to do comparisons for four different integer types:
            C      Rust   delta
signed 32   29.98  31.37  +1.39 (4.6% slower)
signed 64   35.51  38.46  +2.95 (8.3% slower)
unsigned 32 26.17  26.90  +0.73 (2.8% slower)
unsigned 64 33.80  32.36  -1.44 (4.3% faster)
I'm guessing the C compiler can make the signed versions faster partly by assuming that undefined behavior never happens, in a way that Rust isn't permitted to. I was more surprised by the fact that Rust with uint actually beats C with uint64_t.
I also wrote a linear recursive ones() function to see if the compiler would TCO it. This uint version runs in the same time as the above:
fn ones(n: uint, ac: uint) -> uint {
    if n < 1 {
        return ac;
    }
    let (q, d) = (n/10, n%10);
    let tmp = if d == 1 { 1 } else { 0 };
    return ones(q, ac + tmp);
}
I think it's possible to make it perform better, though. In any case, I'm satisfied that Rust's claim to zero-cost abstractions isn't all wind. Thanks for the interesting problem karkhaz!
Rust is under development, has bugs, may eat your laundry, etc. Programs were compiled with `gcc -O2` and `rustc --opt-level 2`.
Offline
Thanks for the interesting problem karkhaz!
No doubt! This is a fun one. And I'm really hoping that more people get involved with programming questions here; Stack Overflow, while great, is a little strict for my tastes.
Anyway, this Haskell version is way slower than my last one. The point is that I implemented my own function that calculates all of the ones in one pass. It calculates the top positive 64-bit int instantly:
***EDIT***
My last implementation of the 'ones' function was using exponents for the 10's and not simple multiplication. This one is 7x faster than the other and is no longer horrifyingly slow. It's still slower than my first version, but that's the next problem...
import Control.Monad

ones :: Int -> Int
ones n
  | n == 0 = 0
  | n < 10 = 1
  | digit == 0 = go next 0 1 1
  | digit /= 0 = go next 1 1 1
  where
    (next, digit) = n `quotRem` 10
    num = n
    go n acc pass tens
      | n == 0 = acc
      | n /= 0 = go next (acc + result) (pass + 1) (tens * 10)
      where
        (next, digit) = n `quotRem` 10
        p = pass * tens
        t = 10 * tens
        result
          | digit == 0 = 0
          | digit == 1 = p + num `rem` t + 1
          | digit /= 1 = p * digit + t

matches :: [Int]
matches = go 1
  where
    go n
      | n == 1111111111 = []
      | n == ones n = n : go (n + 1)
      | otherwise = go (n + 1)

main :: IO ()
main = mapM_ (\x -> print x) $ matches
Last edited by skottish (2014-08-24 19:17:45)
Offline
Trent wrote:Thanks for the interesting problem karkhaz!
No doubt! This is a fun one. And I'm really hoping that more people get involved with programming questions here; Stack Overflow, while great, is a little strict for my tastes.
Cheers. I haven't replied for a few days because I'm super-impressed with xyne's solution, and want to devote a few hours to trying to understand it. I'll just note that this problem came from some kind of parody entrance exam for job candidates at Google...if any of you are between jobs at the moment, you might want to send your CV and a link to this thread to the folks at Mountain View
edit: Just to make it clear, I'm not a head-hunter for google or anything. In case I implied that I was
Last edited by karkhaz (2014-08-24 02:29:41)
Offline