Free Software startups

At GUADEC last days, fun times. One thing I noticed (again) is the growth of the number of small companies, some of them driven by young guys (like, between 25 and 30) doing some really great stuff with Free Software, combining a steady income, whilst still providing valuable contributions to the community.

I’d want to write more about this seen, but here’s a quick idea: how about creating a mailing list for, at one side, these young startups or people interested in starting one, and on the other side people who already got their business running, or already established investors or managers, to discuss issues they encounter, potential business plans, how to get in contact with potential customers, how to combine open communication at one side and dealing with closed environments on the other,…

This could lower the barrier for people to start working full-time on the projects they love, it’d add value to Free Software because larger companies can start using it even more, because more expertise and consultancy is available on the market,…

I guess this is just a shot in the dark, but I think it could be pretty useful. Thoughts?

GUADEC

Leaving for GUADEC ‘08 with RubenV tomorrow. We’re staying in the Golden Horn Sultanahmet hotel. Packing starts in a minute, see you around.

Python ‘all’ odity

[update] Question solved, see bottom of post.

Since Python 2.5 the language got a new built-in method ‘all’ (and it’s nephew ‘any’). I wanted to play around with this a little, combined with generators, so I created a little testcase to test performance.

Here’s the test-case: take a list L of X random numbers in a given range [A, B], and check whether

  • all elements in L are >= A
  • all elements in L are >= (A + Z) where Z is a number in [0, (B - A)]

The first test should always result True, the second test could result to False.

Here’s the output of a test-run:

In [1]: import random, sys

In [2]: a = [random.randint(100, sys.maxint) for i in xrange(2000000)]

In [3]: len(a)
Out[3]: 2000000

In [4]: #Check whether all elements are >= 100 

In [5]: %timeit all(i >= 100 for i in a)
10 loops, best of 3: 515 ms per loop

In [6]: %timeit any(i < 100 for i in a)
10 loops, best of 3: 454 ms per loop

In [7]: def f(l):
   ...:     for i in l:
   ...:         if i < 100:
   ...:             return False
   ...:     return True
   ...: 

In [8]: %timeit f(a)
10 loops, best of 3: 292 ms per loop

In [9]: #Same thing for 100000, since now the list shouldn't be completely iterated

In [10]: %timeit all(i >= 100000 for i in a)
100 loops, best of 3: 4.73 ms per loop

In [11]: %timeit any(i < 100000 for i in a)
100 loops, best of 3: 4.29 ms per loop

In [12]: def g(l):
   ....:     for i in l:
   ....:         if i < 100000:
   ....:             return False
   ....:     return True
   ....: 

In [13]: %timeit g(a)
100 loops, best of 3: 2.82 ms per loop

In [14]: #For reference

In [15]: %timeit False in (i >= 100 for i in a)
10 loops, best of 3: 531 ms per loop

In [16]: %timeit False in (i >= 100000 for i in a)
100 loops, best of 3: 5.03 ms per loop

It’s as if ‘all’, ‘any’ or ‘in’ don’t break/return when a first occurence of False (or True, obviously) is found. Is this the desired behaviour, and if it is, why? The calculation time difference between using all/any/in or a custom-made function (which is, unlike all etc, not written in C) which breaks whenever it can, is pretty astonishing.

[update] Question solved. It’s pretty normal the function-based approach performs better, since it combines what ‘all’ and the generator provided to ‘all’ do, taking away the generator function-call overhead. Damn :-)

Python if/else in lambda

Scott, in your “Functional Python” introduction you write:

The one limitation that most disappoints me is that Python lacks is a functional way of writing if/else. Sometimes you just want to do something like this:

lambda x : if_else(x>100, “big number”, “little number”)

(This would return the string “big number” if x was greater than 100, and “little number” otherwise.) Sometimes I get around this by defining my own if_else that I can use in lambda-functions:

def if_else(condition, a, b) :
   if condition : return a
   else         : return b

Actually, you don’t need this helper if_else function at all:

In [1]: f = lambda x: x > 100 and 'big' or 'small'
In [2]: for i in (1, 10, 99, 100, 101, 110):
...:     print i, 'is', f(i)
...:
1 is small
10 is small
99 is small
100 is small
101 is big
110 is big

James, obviously you’re right… Stupid me didn’t think about that. Your version won’t work when a discriminator isn’t known at import time. But even then a function taking *args and **kwargs with a class-like name, returning a correct class instance, would cut the job.

Regarding the module/plugin stuff, I’d rather use setuptools/pkg_resources :-)

Code Review comic

OSNews: WTFs per minute

I just love this comic.

Python factory-like type instances

When designing applications or libraries, sometimes you need to be able to create instances of a certain interface (in a liberal sense) at runtime without knowing at write/compile time which specific implementation (class) you’ll need to use, as this could depend on runtime variables.

An example of this is an interface providing some functionality which should be implemented differently on different platforms, eg Linux and Windows.

There are some standard patterns how to achieve this. One of them is the factory pattern, which works somewhat like this Python example (let’s pretend ‘PLATFORM’ is ‘linux2′ or ‘win32′, ie sys.platform):

#Pretend we use sys.platform instead of PLATFORM where we use it
PLATFORM = 'linux2'

class FooBase(object):
    def say_foo(self):
        print 'foo'

class PlatformFoo(FooBase):
    def say_platform_foo(self):
        raise NotImplementedError

    @staticmethod
    def get_class():
        #Several ways to get this (dict, introspection, if-tree,...), pick yours
        klass = {
            'linux2': LinuxFoo,
            'win32': WindowsFoo,
        }.get(PLATFORM, None)
        if not klass:
            raise Exception, 'Platform not supported'
        return klass

class WindowsFoo(PlatformFoo):
    def say_platform_foo(self):
        print 'win32 foo'

class LinuxFoo(PlatformFoo):
    def say_platform_foo(self):
        print 'linux foo'

def main():
    foo_class = PlatformFoo.get_class()
    foo = foo_class()
    foo.say_platform_foo()

if __name__ == '__main__':
    main()

Executing this code will, as expected, write ‘linux foo’ to the console. Obviously we could not return the platform-specific class in a PlatformFoo function, but an actual instance, up to you.

Python allows you to handle this situation somewhat nicer though, without introducing any intermediate functions, by using metaclasses.

Continue reading »

Funky C

It’s pretty funny to know this is valid C99 code, implementing a very basic array assignment and hello world:

%:include <stdio.h>
??=include <stdlib.h>

int main(int argc, char *argv<::>) ??<
    int i??(:> = {1, 2, 3??>;
    printf("Hello world\n");
    return 0;
%>

It’s (ab)using an obscure feature in the C89 and C99 specifications called Trigraph and Digraph, when using GCC you need to pass the ‘-trigraphs’ parameter to enable this functionality. More information can be found on the Wikipedia page about it. I wonder whether people joining code obfuscation games use this.

How not to write Python code

Lately I’ve been reading some rather unclean Python code. Maybe this is mainly because the author(s) of the code had no in-depth knowledge of the Python language itself, the ‘platform’ delivered with cPython,… Here’s a list of some of the mistakes you should really try to avoid when writing Python code:

  • Remember Python comes batteries included
    Python is shipped with a whole bunch of standard modules implementing a broad range of functionality, including text handling, various data types, networking stuff (both low- and high-level), document processing, file archive handling, logging, etc. All these are documented in the Python Library Documentation, so it is a must to browse at least through the list of available modules, so you get some notions of what you can use by default. An example: don’t introduce a dependency on Twisted to implement a very basic and simple custom HTTP server if you don’t have any performance needs, use BaseHTTPServer and derivates.
  • Python is Python, don’t try to emulate bad coding patterns from other languages
    Python is a mature programming language which provides great flexibility, but also has some pretty specific patterns which you might not know in other languages you used before.
    As an example, don’t try to emulate PHP’s ‘include’ or ‘require’ function, at all. This could be done, somewhat, by writing the code to be included (and executed on inclusion) in a module on the top level (ie. not in functions/classes/…), and using something like ‘from foo import *’ where you want this code to be executed. This will work, but it can become hard to maintain this. Modules are not meant to be used like this, so don’t. If you need to execute some code at some point, put it in a module as a function, import the function and call it wherever you want.
  • Continue reading »

VirtualBox launch script

I’ve been playing around with VirtualBox some more today (more on it, and other virtualization related stuff might follow later). I got a Windows XP instance running fine (and fast!) in it. Still got some issues (shared folders seem not to work when logged in as a non-administrator user in XP, although I tend to blame Windows for this issue, not VirtualBox), played around with a little Linux installation acting as an iSCSI target for its virtual block devices, accessing them from Windows using the Microsoft iSCSI initiator, etc. Here’s a screenshot of all this.

I added a launcher for my Windows VM to my panel, which was simply running ‘VBoxManage startvm <VM UUID>’. This was not an optimal solution though, as I wanted it to show a little error dialog when something went wrong when attempting to launch the VM (eg because I forgot it was already running), when the virtual disk files aren’t accessible (because I forgot to mount the volume they reside on), etc.

So I cooked a little script which runs some sanity checks before launching the virtual machine, and reports any errors. Here it is:

Continue reading »

Massive 70000 workstation Ubuntu deployment in France

After switching to OpenOffice in 2005 and introducing Mozilla Firefox and Thunderbird on their machines in 2006, the French paramilitary police (’Gendarmery’) will make the switch to 100% Free Software based desktops in the coming years. The migration should be completed in 2014.

All workstations will be converted to Ubuntu desktops, starting this year with 5000-8000 seats, growing to 12000-15000 over the next four years. By 2014, all 70000 (!!!) desktops should be running free software.

There are three major reasons for the full switch:

  1. Remove dependency on one single supplier
  2.  Gain full control over the whole operating system stack
  3. Reduce costs

Nowadays licensing costs sum up to 7000000€ (that’s seven million euros) every year.

I guess this must be one of the largest Linux desktops deployments ever?

Source: AFP

Next »