Python questions

Just getting a sense of it. That is all. Seems like there are many ways to do the same thing. What got me going was the lack of tools for text in G’MIC.

@afre Regex is another language. It would take a while to explain. Better yet, try to learn it if you want to do this. I want to learn Regex, but already have too much on my plate.

I only know very, very simple regex, and usually rely on online sources to solve my problems. The issue, though, is that there are different flavors of regex, which aren’t fully compatible with each other.

Here’s a manual for regex. It doesn’t help much when deciphering an existing pattern, but it’s a start. Only practice helps. Lots of it.

I did it, using my previous strategy. It didn’t work at first because of a typo.

Input:

import re

text = '''One fish, Two fish, Red fish, Blue fish,
Black fish, Blue fish, Old fish, New fish.
This one has a littlecar.
This one has a little star.
Say! What a lot of fish there are.
Yes. Some are red, and some are blue.
Some are old and some are new.
Some are sad, and some are glad,
And some are very, very bad.
'''

l = '::'.join([line for line in text.splitlines()])

count=-1
replaced=l
while count!=0:
  replaced,count=re.subn(r'(\b\w+?\b)(.+)(\b\1\b)(.+)(\b\1\b)(.*)',r'\1\2\3\4***\6',replaced,flags=re.IGNORECASE)

l = replaced.replace('::','\n')

Result:

One fish, Two fish, Red ***, Blue ***,
Black ***, Blue ***, Old ***, New ***.
This one has a littlecar.
This *** has a little star.
Say! What *** lot of *** there are.
Yes. Some are red, and some *** ***.
*** *** old and *** *** new.
*** *** sad, *** *** *** glad,
*** *** *** very, very bad.
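For comparison, the same masking can probably be done in a single pass, without the loop. This is only a minimal sketch, not the code used above: `mask_repeats`, `keep` and `mask` are names I made up, and it assumes a "word" is whatever `\w+` matches. It counts each word case-insensitively and replaces every occurrence after the second.

```python
import re
from collections import Counter

def mask_repeats(text, keep=2, mask="***"):
    """Replace every occurrence of a word after the first `keep` ones (hypothetical helper)."""
    seen = Counter()

    def replace(match):
        word = match.group(0)
        seen[word.lower()] += 1          # count case-insensitively
        return word if seen[word.lower()] <= keep else mask

    return re.sub(r"\b\w+\b", replace, text)

print(mask_repeats("One fish, Two fish, Red fish, Blue fish."))
# → One fish, Two fish, Red ***, Blue ***.
```

Because the callback sees the words in reading order, there is no need to join the lines with '::' first; newlines are simply left alone.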

Take the “Owl book” (Mastering Regular Expressions, by Jeffrey Friedl). The first edition did have a chapter on Python, which was removed in later editions. But you don’t need to care, because these are small implementation/usage details, and the book is very good at explaining the basics (and also some not-so-basic stuff).

This is a very useful skill, and more than once I have replaced several dozen lines of bad code with a single regex (in Python, Java, Bash, C++…).


Any idea how I can convert this to c++ code?

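from functools import reduce
from itertools import combinations
from operator import mul

def xelf_factors(n):
    pf = primefactors(n)
    # multiply every combination of 1..len(pf)-1 prime factors; the set drops duplicates
    af = { reduce(mul,x) for z in range(1,len(pf)) for x in combinations(pf,z) }
    return sorted({1,n}|af)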

def primefactors(n):
    factors,d = [],2
    while n > 1:
        while n%d==0:
            factors.append(d)
            n//=d
        d+=1
    return factors

That is the fastest way I have found to get the factors of a number.

My C++ is rusty, and this is about Python.

If instead of keeping the factors as a plain sequence you create a (factor,power) sequence, then the output of xelf_factors is just a matter of iterating all factors:

def primespowers(n):
    primesWithPower=[]
    d=2
    while n > 1:
        power=0
        while n%d==0:
            n//=d
            power+=1
        if power!=0:
            primesWithPower.append((d,power))
        d+=1
    return primesWithPower

def products(primesWithPower):
    if len(primesWithPower)==0:
        return [1]
    result=[]
    divisor,maxpower=primesWithPower[0]
    others=products(primesWithPower[1:])
    for pow in range(0,maxpower+1):
        result.extend([x*(divisor**pow) for x in others])
    return result
    
def divisors(n):
    primesWithPower=primespowers(n)
    return products(primesWithPower)

You can recurse without fear: if your max integer is 2^64-1, a number cannot have more than 63 prime factors, even counted with multiplicity (this also means that you can allocate an array of 63 factors at the beginning).
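The 63 bound comes from 2 being the smallest prime: 2^63 still fits below 2^64-1, and 2^64 does not. A quick sanity check, as a sketch (the helper names `max_prime_factors`, `is_prime` and `max_distinct_primes` are mine, not from the code above); it also counts how many distinct primes can fit, which is what the recursion depth of products actually depends on.

```python
from math import isqrt

LIMIT = 2**64 - 1  # largest unsigned 64-bit integer

def max_prime_factors(limit):
    """Largest k with 2**k <= limit: the most prime factors, counted with multiplicity."""
    return limit.bit_length() - 1

def is_prime(k):
    """Plain trial division; fine for the tiny candidates tested here."""
    return k >= 2 and all(k % p for p in range(2, isqrt(k) + 1))

def max_distinct_primes(limit):
    """How many of the smallest primes can be multiplied together without exceeding limit."""
    count, product, candidate = 0, 1, 2
    while True:
        if is_prime(candidate):
            if product * candidate > limit:
                return count
            product *= candidate
            count += 1
        candidate += 1

print(max_prime_factors(LIMIT))    # → 63
print(max_distinct_primes(LIMIT))  # → 15
```

So for 64-bit input the (factor,power) list has at most 15 entries, and the recursion is correspondingly shallow.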

In this form the conversion to C or C++ is a lot more straightforward.
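If recursion is a problem in the target language, the same idea can be written with plain loops. This is a rough sketch under my own assumptions: `divisors_iterative` is a made-up name, and stopping the trial division at the square root is a shortcut I added that is not in the Python above.

```python
def divisors_iterative(n):
    """All divisors of n (n >= 1), built prime by prime without recursion."""
    divisors = [1]
    d = 2
    while d * d <= n:
        power = 0
        while n % d == 0:
            n //= d
            power += 1
        if power:
            # multiply every divisor found so far by d, d^2, ..., d^power
            base = list(divisors)
            factor = 1
            for _ in range(power):
                factor *= d
                divisors.extend(x * factor for x in base)
        d += 1
    if n > 1:
        # whatever is left is a single prime larger than the square root
        divisors.extend([x * n for x in divisors])
    return sorted(divisors)

print(divisors_iterative(12))  # → [1, 2, 3, 4, 6, 12]
```

Only a growing list and three nested loops are involved, which should map onto arrays and counters fairly directly.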

Well, I found the closest to manual conversion:

 all possible combinations of those prime factors.

Here's an implementation without much optimization using uint64_t instead of multiprecision that completes within 305 ms for input 10,000,000,000,000,000 on my machine.

Note that the performance will get significantly worse for a larger number of distinct prime factors (12132 ms for the product of the smallest 14 primes). This is caused by the fact that there are just more combinations to calculate/print.

#include <chrono>
#include <cmath>    // std::sqrt
#include <cstdint>  // uint64_t
#include <iostream>
#include <utility>
#include <vector>

using PrimeFactors = std::vector<std::pair<uint64_t, uint64_t>>;

std::vector<std::pair<uint64_t, uint64_t>> FindFactors(uint64_t n)
{
    PrimeFactors primeFactors;

    uint64_t square = static_cast<uint64_t>(std::sqrt(n));
    for (uint64_t i = 2; i <= square && i <= n; ++i)
    {
        bool isPrime = true;
        for (auto [prime, exponent] : primeFactors)
        {
            if (prime * prime > i)
            {
                break;
            }
            if (i % prime == 0u)
            {
                isPrime = false;
                break;
            }
        }

        if (isPrime)
        {
            uint64_t count = 0;
            while (n % i == 0)
            {
                ++count;
                n /= i;
            }
            primeFactors.emplace_back(i, count);
            if (count != 0)
            {
                square = static_cast<uint64_t>(std::sqrt(n));
            }
        }
    }
    if (n != 1)
    {
        primeFactors.emplace_back(n, 1);
    }
    return primeFactors;
}

void PrintFactors(uint64_t factor, PrimeFactors::const_iterator pos, PrimeFactors::const_iterator const end)
{
    while (pos != end)
    {
        while (pos != end && pos->second == 0)
        {
            ++pos;
        }
        if (pos == end)
        {
            // only zero-exponent entries were left: nothing to dereference
            break;
        }
        auto newFactor = factor;
        for (auto count = pos->second; count != 0; --count)
        {
            newFactor *= pos->first;
            std::cout << newFactor << '\n';
            PrintFactors(newFactor, pos + 1, end);
        }
        ++pos;
    }
}

int main()
{
    using Clock = std::chrono::steady_clock;

    uint64_t const input = 10'000'000'000'000'000ull;
    //uint64_t const input = 2ull * 3ull * 5ull * 7ull *11ull * 13ull *17ull * 19ull * 23ull * 29ull *31ull*37ull * 41ull*43ull;

    auto start = Clock::now();
    auto factors = FindFactors(input);

    // print
    std::cout << 1 << '\n';
    PrintFactors(1, factors.begin(), factors.end());
    auto end = Clock::now();
    std::cout << "took " << std::chrono::duration_cast<std::chrono::milliseconds>(end - start).count() << " ms\n";
}
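To put numbers on the remark about distinct prime factors: the count of divisors is the product of (exponent + 1) over the prime factorization. A small sketch, in Python to match the rest of the thread (`divisor_count` is a name I made up for illustration):

```python
def divisor_count(prime_powers):
    """Number of divisors for a factorization given as (prime, exponent) pairs."""
    count = 1
    for _prime, exponent in prime_powers:
        count *= exponent + 1
    return count

# 10^16 = 2^16 * 5^16
print(divisor_count([(2, 16), (5, 16)]))  # → 289

# product of the smallest 14 primes, each with exponent 1
first_14_primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43]
print(divisor_count([(p, 1) for p in first_14_primes]))  # → 16384
```

So the second input has to produce and print far more divisors than the first, even though it is a smaller number.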

My main goal is to convert the fastest solution to G’MIC. Will try yours too.

LOL - I will add a C++ tag to this thread.

To answer @Reptorian’s previous question of why I am interested in Python (again; I used Python 2 for some image processing a long time ago but forgot most of it): I enjoy the Jupyter notebook concept where I can sort of see the code in action.

I would love to see some of that work, but not on this thread.

Ok, looks like I will improve upon the current Python script I use for incrementing numbers.

Here’s the code:

import re
test_string="variable_a=$7 variable_a=$18 variable_b,variable_8=${19-20} variable_d=$21 blur 10 ${10=}"
y = re.split(r'(?:\${?)(\d+)\-*(?:(\d+)(?:\}))*', test_string)

print(y)

# ['variable_a=', '7', None, ' variable_a=', '18', None, ' variable_b,variable_c=', '19', '20', ' variable_d=', '21', None, ' blur 10']
# Note that the '$'s are gone, and the ${ - }s are gone as well. 19 and 20 are supposed to be in the form ${19-20}. 7 is supposed to be in the form $7.

The problem is described in the comment. The goal is to change every number that matches the regex, as long as it is greater than a given threshold, by adding a fixed amount to it (or subtracting one).

Also, I will be keeping these two regexes here for later use:

# These will be used to verify that a string is in this form, so I can extract the numbers and add to them.
\$(\d+)
\$\{\d+\-+\d+\}

I don’t know if these two regex cases are needed.
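If they do turn out to be useful, a minimal sketch of how the two patterns could be used to check a single token with `re.fullmatch` (the `classify` helper and its return values are hypothetical, only here for illustration):

```python
import re

SINGLE = re.compile(r"\$(\d+)")          # $7, $18
RANGE = re.compile(r"\$\{\d+\-+\d+\}")   # ${19-20}

def classify(token):
    """Return 'single', 'range' or None depending on which pattern the whole token matches."""
    if SINGLE.fullmatch(token):
        return "single"
    if RANGE.fullmatch(token):
        return "range"
    return None

for token in ["$7", "${19-20}", "blur"]:
    print(token, classify(token))
```

`fullmatch` only succeeds when the entire token fits the pattern, so something like `$7x` is rejected.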

Do you want the result as a list or do you want the initial string with the numbers updated?

Initial string with numbers updated. Also, I updated the test_string. ${=v} case should be factored in as well, but that is detected by the regex code.

I don’t see a ${=v}, just a ${v=} (${10=}).

${v=} is correct. My bad.

@Ofnuts I’m so close, though I had help from the Python Discord. Here’s the current code.

from functools import partial
import re
import pyperclip as pc


test_string="variable_a=$7 variable_a=$18 variable_b,variable_8=${19-20} variable_d=$21 blur 10 ${10=}"

# gmic_str_paste=pc.paste()
# gmic_str=str(gmic_str_paste)

pattern = re.compile(r"(?<=\$)(?:\{(\d+)-(\d+)}|(\d+))|\$\{(\d+)=\}")

def addfunc(m, *, n, v):
    def incif(num):
        return num + n if num > v else num

    a, b, c , d = m[1], m[2], m[3], m[4]
    if a and b:
        a = incif(int(a))
        b = incif(int(b))
        return f'{{{a}-{b}}}'
    elif c:
        c = incif(int(c))
        return f'{c}'
    elif d:
        d = incif(int(d))
        return f'${{{d}}}'


add = partial(addfunc, n=1, v=4)
out_string = pattern.sub(add, test_string)
print(test_string)
print(out_string)

It’s just case ${v=} that’s the problem.

EDIT: with return "${"+str(d)+"=}", I think this can work.

Ok, in slow-mo:

  • (?<=\$): must follow a dollar sign (a.k.a. “look-behind”)
  • (?:...): non-capturing group, just to avoid a useless match
  • (...|...|...): any of these patterns will do, and the matched string ends up in the relevant group
  • \{(\d+)-(\d+)}: trying to match the {19-20} form (so ${19-20}), but could require escaping the closing brace
  • (\d+): matches an integer (so $7, $18)
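The pieces listed above can be watched in action on a small string. This sketch uses the pattern with the closing brace escaped and a made-up input; it only shows which group is filled for each kind of match.

```python
import re

pattern = re.compile(r"(?<=\$)(?:\{(\d+)-(\d+)\}|(\d+))")

for m in pattern.finditer("a=$7 b=${19-20} blur 10"):
    # group(0) is the matched text (without the '$'), groups() the three captures
    print(repr(m.group(0)), m.groups())
# → '7' (None, None, '7')
# → '{19-20}' ('19', '20', None)
```

The `10` after `blur` is never matched because no `$` precedes it, and the `$` itself stays out of the match, so a replacement does not need to put it back.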

Now, where things go wrong:

  • You just want to add a new pattern in there to extract a number between a { and a =} (the $ is taken care of by the initial look-behind)
  • So there is no need for that new $; just add your pattern to the main OR’ing pattern:
pattern = re.compile(r"(?<=\$)(?:\{(\d+)-(\d+)\}|(\d+)|\{(\d+)=\})")

then:

from functools import partial
import re


test_string="variable_a=$7 variable_a=$18 variable_b,variable_8=${19-20} variable_d=$21 blur 10 ${10=}"

# gmic_str_paste=pc.paste()
# gmic_str=str(gmic_str_paste)

pattern = re.compile(r"(?<=\$)(?:\{(\d+)-(\d+)\}|(\d+)|\{(\d+)=\})")

def addfunc(m, *, n, v):
    print(f'addfunc({m[1]},{m[2]},{m[3]},{m[4]})')
    def incif(num):
        return num + n if num > v else num

    a, b, c , d = m[1], m[2], m[3], m[4]
    if a and b:
        a = incif(int(a))
        b = incif(int(b))
        return f'{{{a}-{b}}}'
    elif c:
        c = incif(int(c))
        return f'{c}'
    elif d:
        d = incif(int(d))
        return f'{{{d}=}}'


add = partial(addfunc, n=1, v=4)
out_string = pattern.sub(add, test_string)
print(test_string)
print(out_string)

yields:

addfunc(None,None,7,None)
addfunc(None,None,18,None)
addfunc(19,20,None,None)
addfunc(None,None,21,None)
addfunc(None,None,None,10)
variable_a=$7 variable_a=$18 variable_b,variable_8=${19-20} variable_d=$21 blur 10 ${10=}
variable_a=$8 variable_a=$19 variable_b,variable_8=${20-21} variable_d=$22 blur 10 ${11=}
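Since subtraction was also part of the goal, the same logic can be wrapped in one function that takes the amount and the threshold as arguments. A sketch only: `shift_references`, `amount` and `threshold` are names I picked, and it assumes the three forms above are the only ones that matter.

```python
import re

PATTERN = re.compile(r"(?<=\$)(?:\{(\d+)-(\d+)\}|(\d+)|\{(\d+)=\})")

def shift_references(text, amount, threshold):
    """Add `amount` (may be negative) to every $N, ${N-M} and ${N=} number above `threshold`."""
    def bump(digits):
        value = int(digits)
        return value + amount if value > threshold else value

    def replace(m):
        a, b, c, d = m.groups()
        if a is not None:                 # ${a-b}
            return f"{{{bump(a)}-{bump(b)}}}"
        if c is not None:                 # $c
            return str(bump(c))
        return f"{{{bump(d)}=}}"          # ${d=}

    return PATTERN.sub(replace, text)

print(shift_references("$7 ${19-20} ${10=} $3", -1, 4))
# → $6 ${18-19} ${9=} $3
```

Note that the threshold is tested before the shift, so with a negative amount a result can end up at or below the threshold.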

Ok, I have a tricky problem here. I already have code that does something like this, but it is limited, and this version seems closer to a reasonable solution. It doesn’t depend on whether ‘(’ or ‘{’ follows the type, which makes it an improvement. In addition, it can find variable names and types even when they are not right next to #@gui.

Behold this code:

import re

sample_str="""#@gui Sample Code: fx_rep_sample
#@gui :Integer Value=int(0,0,5)
#@gui :Float value=float(0,0,5)
#@gui :Choices=choice(0,"First Choice","Second Choice")
#@gui :Point Location=point(50,50)
#@gui :Press this Button=button()
#@gui :Text Input=text("Here's a text")
#@gui :Checkmark=bool(0)
#@gui :Color A=color(0,50,20)
#@gui :Color B=color(210,55,180,220)
#@gui :_=note("Separated"),Variable After Comma=choice(0,"A","B")"""

result_re=re.findall(r'(\#\@gui\ :)(.*,)?(.*)=(int|choice|text|bool|color|point|button|float)',sample_str)

print(result_re[0][3])

The goal is to turn sample_str into this (and the reverse):

#@gui Sample Code: fx_rep_sample
#@gui :1.Integer Value=int(0,0,5)
#@gui :2.Float value=float(0,0,5)
#@gui :3.Choices=choice(0,"First Choice","Second Choice")
#@gui :4.5.Point Location=point(50,50)
#@gui :6.Press this Button=button()
#@gui :7.Text Input=text("Here's a text")
#@gui :8.Checkmark=bool(0)
#@gui :9.10.11.Color A=color(0,50,20)
#@gui :12.13.14.15.Color B=color(210,55,180,220)
#@gui :_=note("Separated"),16.Variable After Comma=choice(0,"A","B")

Here are some of the rules:

int => 1
float => 1
choice => 1
point => 2
button => 1
text => 1
bool => 1
color with 3 numbers => 3
color with 4 numbers => 4

Lines in which regex doesn’t find anything should not be modified.

One thing I noticed is that splitlines can be used to solve most of the problem, except for the last line, where two definitions share one line. That’s the part I’m stumped on. Do I use a regex split to solve that?

EDIT: I solved this, as shown in this post: Some Python-Based script to make G'MIC scripting easier - #6 by Reptorian