Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: JRuby 1.6
    • Fix Version/s: None
    • Component/s: Performance
    • Labels: None
    • Number of attachments: 0

      Description

      Another case where we are comparing 100% "native" impls, this time with String#to_i.

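      Results are mixed: no implementation wins every case. JRuby leads on the empty string and Rubinius on the other three, while Ruby 1.9 edges out JRuby only on the float case. Combining the best of each implementation would make one of them fastest on every case.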

      ~/projects/rubinius ➔ jruby --server -I benchmark/lib/ benchmark/core/string/bench_to_i.rb 
      #to_i with an integer in a string
                            7633077.1 (±7.4%) i/s -   37354603 in   4.934960s (cycle=29623)
      #to_i with a float in a string
                            6733175.0 (±4.9%) i/s -   33241488 in   4.954528s (cycle=31242)
      #to_i with an empty string
                           11830673.8 (±5.1%) i/s -   58289616 in   4.942069s (cycle=47313)
      #to_i with an integer and extra text
                            3821345.2 (±5.7%) i/s -   18957678 in   4.981885s (cycle=41757)
      
      ~/projects/rubinius ➔ ruby1.9 -I benchmark/lib/ benchmark/core/string/bench_to_i.rb 
      #to_i with an integer in a string
                            7261663.4 (±1.5%) i/s -   36318960 in   5.002643s (cycle=55280)
      #to_i with a float in a string
                            7046664.4 (±1.6%) i/s -   35246844 in   5.003213s (cycle=59139)
      #to_i with an empty string
                            9097651.9 (±1.6%) i/s -   45490088 in   5.001590s (cycle=59542)
      #to_i with an integer and extra text
                            3204945.1 (±9.6%) i/s -   15882468 in   5.002829s (cycle=54022)
      
      ~/projects/rubinius ➔ bin/rbx -I benchmark/lib/ benchmark/core/string/bench_to_i.rb 
      #to_i with an integer in a string
                            7926463.3 (±3.7%) i/s -   39419937 in   4.983585s (cycle=38761)
      #to_i with a float in a string
                            7818289.9 (±1.9%) i/s -   38956995 in   4.984837s (cycle=33729)
      #to_i with an empty string
                            7941361.7 (±4.5%) i/s -   39475787 in   4.987574s (cycle=42769)
      #to_i with an integer and extra text
                            7955217.3 (±2.2%) i/s -   39643632 in   4.987127s (cycle=42264)
      

      Here's the bench from Rubinius's suite:

      require 'benchmark'
      require 'benchmark/ips'
      
      Benchmark.ips do |x|
        int = "5"
        float = "5.0"
        empty = ""
        with_extra_text = "5 and some extra characters"
      
        x.report "#to_i with an integer in a string" do |times|
          i = 0
          while i < times
            int.to_i
            i += 1
          end
        end
      
        x.report "#to_i with a float in a string" do |times|
          i = 0
          while i < times
            float.to_i
            i += 1
          end
        end
      
        x.report "#to_i with an empty string" do |times|
          i = 0
          while i < times
            empty.to_i
            i += 1
          end
        end
      
        x.report "#to_i with an integer and extra text" do |times|
          i = 0
          while i < times
            with_extra_text.to_i
            i += 1
          end
        end
      end
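
For reference, all four benchmark inputs exercise the same documented behavior: String#to_i reads a leading integer and ignores whatever follows, returning 0 when there is nothing to read. A minimal sketch of the expected results (the `inputs` hash is only illustrative and is not part of the Rubinius suite):

```ruby
# Expected String#to_i results for the four benchmark inputs.
inputs = {
  "5"                           => 5,  # plain integer
  "5.0"                         => 5,  # parsing stops at the "."
  ""                            => 0,  # no digits at all
  "5 and some extra characters" => 5   # trailing text is ignored
}

inputs.each do |string, expected|
  actual = string.to_i
  puts "#{string.inspect}.to_i => #{actual}"
  raise "unexpected result for #{string.inspect}" unless actual == expected
end
```

Since every case returns a small Fixnum, the differences in the numbers above come from how each implementation gets to that result, not from what it returns.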
      

        Activity

        Charles Oliver Nutter added a comment -

        Ok, we are now fastest on all but the last one, and there's a simple reason for it.

        The logic in ConvertBytes.bytelistToInum tries to determine, before doing a full parse, whether the result will fit in Long.SIZE (64) bits. In the last case, the trailing garbage makes the number look longer than it really is, so the logic falls back to BigInteger parsing. This creates an intermediate BigInteger that ends up being normalized back to a Fixnum anyway.
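
To make the failure mode concrete, here is a rough Ruby sketch of a length-based size estimate. It is not JRuby's code: the constant `ASSUMED_BITS_PER_BYTE` and the method `estimate_fits_in_long?` are hypothetical, and the figure of 4 bits per input byte is an assumption chosen only to illustrate how counting every byte, including trailing text, overestimates the size of the number.

```ruby
LONG_BITS = 64             # Long.SIZE on the JVM
ASSUMED_BITS_PER_BYTE = 4  # assumption: conservative bits per decimal digit

# Hypothetical pre-parse check: guesses the result size from the whole
# input length, without looking at which bytes are actually digits.
def estimate_fits_in_long?(str)
  str.bytesize * ASSUMED_BITS_PER_BYTE <= LONG_BITS
end

["5", "5.0", "", "5 and some extra characters"].each do |s|
  bits = s.bytesize * ASSUMED_BITS_PER_BYTE
  path = estimate_fits_in_long?(s) ? "long path" : "BigInteger path"
  puts "#{s.inspect}: estimated #{bits} bits -> #{path}"
end
```

Under this assumption the 27-byte string is estimated at 108 bits and is sent down the slow path, even though only its first byte contributes to the result.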

        This would be fixable either by doing a better job of calculating which characters will actually be used for the resulting number, or by going ahead and calculating the result, then checking how many bytes were actually used before falling back to the BigInteger path.
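
The second option could look roughly like the following Ruby sketch. It is an illustration of the idea only, not a patch for ConvertBytes: `to_i_sketch`, `WHITESPACE`, and `LONG_MAX` are hypothetical names, the sketch handles base 10 only, and it ignores underscores and radix prefixes. Because Ruby integers never overflow, the 64-bit limit is simulated with an explicit comparison; a Java version would need to check for overflow before each multiply.

```ruby
LONG_MAX   = (1 << 63) - 1                         # largest Java long
WHITESPACE = [0x09, 0x0A, 0x0B, 0x0C, 0x0D, 0x20]  # ASCII whitespace bytes

# Parses the leading decimal integer of str and reports which path a
# long-first implementation would take: :fast while the value still fits
# in a signed 64-bit long, :big once it no longer does.
def to_i_sketch(str)
  bytes = str.bytes
  pos = 0
  pos += 1 while pos < bytes.size && WHITESPACE.include?(bytes[pos])

  negative = false
  if pos < bytes.size && (bytes[pos] == 0x2B || bytes[pos] == 0x2D)
    negative = (bytes[pos] == 0x2D)  # "-" is 0x2D, "+" is 0x2B
    pos += 1
  end

  # A negative long can hold one more unit of magnitude than a positive one.
  limit = negative ? LONG_MAX + 1 : LONG_MAX
  magnitude = 0
  path = :fast

  # Only bytes that are actually digits are consumed; trailing text never
  # influences the decision about which path to take.
  while pos < bytes.size && bytes[pos].between?(0x30, 0x39)
    magnitude = magnitude * 10 + (bytes[pos] - 0x30)
    path = :big if magnitude > limit
    pos += 1
  end

  [negative ? -magnitude : magnitude, path]
end

p to_i_sketch("5 and some extra characters")
p to_i_sketch("9223372036854775808")
```

With this shape, the trailing-text case stays on the fast path, and the BigInteger path is reached only when the digits themselves exceed 64 bits.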


          People

          • Assignee:
            Unassigned
            Reporter:
            Charles Oliver Nutter
          • Votes:
            0
            Watchers:
            1
