PiCloud : Cloud Computing. Simplified. (picloud.com)
112 points by ggruschow on Feb 27, 2010 | 41 comments


"Imagination at Work" - You probably shouldn't use that phrase next to a light bulb on your home page. I'm not even going to bother to do the search, but I'm 100% sure that GE has that trademarked.


The documentation says: "While the cloud module works well as a drop-in solution for pure python code, it doesn’t handle python extensions written in C/C++. If you use custom Python extensions that PiCloud does not have installed, you’ll need to upload them to PiCloud."

As I am not an advanced Python user, I find this sentence ambiguous. I want to use PIL or GFX from swftools.org (I need PDF-to-PNG rendering and image manipulation). Both are written in C / C++, and I think they count as Python extensions?

- So can I upload them?

- Is there a list of installed Python extensions?


Yes, on the configuration page you can upload an archive (or sync a repository point) which contains a python extension build script.



Looks nice, but I would rather have the processing power local to me. I have used the python framework, octopy. It is a mapreduce framework for python. Caveat: I have only used it for trivial problems thus far.

http://code.google.com/p/octopy/


Octopy looks far too simplistic for anything too large. For example, it pushes the entire python source file to the client, including all of the data. It then assigns a client data to process? I've been reading the code for a few minutes and can't imagine using it for anything substantial...

I too prefer to have local processing, but there are times where I also like to farm out jobs to a larger HPC cluster (I'm at a university with 2 large clusters). The downside to this is that I usually have to wait in a job queue.

I guess what I really want is a hybrid approach where I can run jobs locally, but if there are too many, spin up either a LoadLeveler / PBS job on a university cluster or a few EC2 instances.
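A minimal sketch of the overflow idea described above, assuming the simplest possible policy: fill the local slots first and hand everything else to a remote backend. The function name `plan_jobs` and the `local_slots` parameter are hypothetical, and actually submitting the overflow to PBS, LoadLeveler, or EC2 is left out.

```python
def plan_jobs(jobs, local_slots):
    """Split jobs into a batch to run locally and an overflow batch.

    The first `local_slots` jobs stay local; the rest would be handed to
    whatever remote backend is available (cluster queue, cloud instances).
    """
    return jobs[:local_slots], jobs[local_slots:]


# Ten pending jobs, four local worker slots.
local_batch, overflow = plan_jobs(list(range(10)), local_slots=4)
```

A real scheduler would also weigh queue wait time and cost before deciding where the overflow goes; this only shows the split.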


Nice. I'm currently using GAE. How do you plan to handle CDNs?

Also, should I want to migrate my data from Google's Datastore, how easy would that be?

What do you have that GAE doesn't?

I'm asking on behalf of myself, mostly, but I imagine there are others here deploying mission critical stuff on app engine.


I'm not involved with them in any way, but their FAQ says that although the product is Amazon-specific at the moment, you can contact them directly if you want to work with Rackspace or a private install.


If you're familiar with Python, this is essentially outsourcing the Processing module.


As well as providing a transparent computing platform. This is a much better alternative for my batch processing than keeping my laptop on overnight.


You would still need to keep your laptop on as you wait for your function call to return.


PiCloud dev here.

You don't need to. As long as you save the returned job id (what is returned by cloud.call and cloud.map), you can access its return value at any future time - from any computer.
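The job-id pattern described here can be sketched locally. This is not PiCloud's implementation: `call` and `result` below are hypothetical stand-ins built on Python's `concurrent.futures`, and the job table lives in memory, so unlike the hosted service the id is only valid inside this one process.

```python
from concurrent.futures import ThreadPoolExecutor
import itertools

# Hypothetical local stand-in for a job service; not the PiCloud client.
_executor = ThreadPoolExecutor(max_workers=4)
_jobs = {}                      # job id -> Future
_next_id = itertools.count(1)


def call(func, *args):
    """Submit func(*args) and return an integer job id immediately."""
    jid = next(_next_id)
    _jobs[jid] = _executor.submit(func, *args)
    return jid


def result(jid):
    """Block until the job with this id finishes, then return its value."""
    return _jobs[jid].result()


def square(x):
    return x * x


jid = call(square, 7)
# Only the id has to be kept around; the value can be fetched later.
value = result(jid)
```

The point is that the caller holds a small handle rather than waiting on the function call itself, which is what lets the laptop go to sleep in the hosted case.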


I stand corrected. Thank you. :)


SWEET site guys. This is exactly what I was looking for. Thanks for providing such a great service that I know many developers were looking for.


Looks very interesting! Is there a similar module available for doing something like this locally?


The cloud package includes a module 'cloud.mp' (http://docs.picloud.com/client_adv.html#cloud-mp) which allows the cloud semantics to be used locally (using python's multiprocessing library).



Are you using boto on the backend?


PiCloud dev here.

Nope, but we probably should.


How do they scale automatically on Python (a non-functional language)? I am puzzled.


They probably spawn several python processes


Indeed, it seems a single cloud.call will not give you any parallelism; you need many calls. So the whole parallelization challenge is left completely up to you. Still, this is a nice approach, and I'd like to see more companies in this space.


Agreed, a very cool idea. I believe you are right on the parallelism front: cloud.call just offloads the processing to their servers, and you'd have to call it multiple times to get any parallelism.


PiCloud dev here.

That is correct. We also offer a mapping function, cloud.map(func, arg_list), where every func(arg_list[i]) will be evaluated in parallel.
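For readers who want to see the shape of such a mapping function, here is a local sketch under stated assumptions. `parallel_map` is a hypothetical name, and a thread pool stands in for remote workers; in CPython, threads will not speed up CPU-bound work, so this shows the calling convention rather than the performance.

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_map(func, arg_list, max_workers=4):
    """Evaluate func(arg) for every arg concurrently.

    Results come back in the same order as arg_list, regardless of
    which call finishes first.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(func, arg_list))


def word_count(text):
    return len(text.split())


docs = ["cloud computing simplified", "hello world", ""]
counts = parallel_map(word_count, docs)
```

Each `func(arg_list[i])` is independent of the others, which is the property that makes this kind of map easy to spread across machines.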


This is just a way to call a function and run it on one or more other servers. You still have to design your program to handle parallelism.


Python is a functional language in the sense that it has first class functions and you can pass them around.


Yes but it's not purely functional in that you cannot automagically make a script scale to n threads, as is the case with Haskell. This is what the OP is referring to.


You can't really do that either in Haskell. Well, in theory you can, but in practice nobody has yet written a way to autoparallelize Haskell that actually gets you reasonable speedups, because it requires predicting which parts are worth parallelizing.

Instead, the Haskell devs are mostly taking the same approach, of providing constructs for the programmer to explicitly write parallel code: http://www.haskell.org/ghc/docs/6.6/html/libraries/base/Cont...


Correct. cloud.call, etc. are all higher-level functions.

PiCloud does treat cloud.call(func) dispatches as completely functional. The function being evaluated cannot have side effects on the calling program's state; within the program, you access its output via cloud.result(). (Note: the function can still have external side effects, e.g. opening a connection to a database and modifying entries.)
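A small illustration of why a dispatched function cannot change the caller's state. This is a sketch, not PiCloud's mechanism: `remote_call` is a hypothetical stand-in that copies arguments and results through `pickle`, on the assumption that a remote dispatch has to serialize them in a similar way.

```python
import pickle


def remote_call(func, *args):
    """Hypothetical stand-in for a remote dispatch.

    Arguments and the return value are round-tripped through pickle,
    so the function only ever sees copies of the caller's objects.
    """
    shipped_args = pickle.loads(pickle.dumps(args))
    return pickle.loads(pickle.dumps(func(*shipped_args)))


def append_total(numbers):
    numbers.append(sum(numbers))  # mutates only the worker's copy
    return numbers


local = [1, 2, 3]
returned = remote_call(append_total, local)
# `local` is unchanged; the new value is visible only through the result.
```

Anything the function should communicate back has to travel in its return value, or through an external system such as a database.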


- Edit: never mind, I found the answer in the 'pitfalls' doc. Yes, you can create PiCloud calls from within PiCloud, but they of course count against your maximum number of parallel calls. -

Hi

I have (quickly) read your web scraper example and I still have one question: is it possible to launch PiCloud jobs from within a PiCloud job? For example, have a PiCloud job download a list of URLs as CSVs, start web scrapers for them, and then have the original function return the results of those scrapers? Thanks for your answer.


any beta code love?


PiCloud dev here.

Don't worry about the beta codes - you'll be approved in FIFO order.


This is great. I've been playing around with a similar tool called Monkey Analytics (www.monkeyanalytics.com), but they're still working out the kinks with their parallelism.


Why doesn't anyone like PHP?


Such an unfortunate question. Simple answer: There are lots of problems with the design of the language that people like to get snobbish about.


how's that namespace working out for you?


Who needs namespaces when you have a 'quick reference'?

http://php.net/quickref.php


All programming languages should have a quick reference page exactly like this. It would make learning and working with them so much easier.


A lot of them do, except you can't just call the functions, you usually have to import the right thing and then instantiate the right object first. The reason why this works so well is also one of the limitations of PHP.


yay for the index! lol. (ahh i shouldn't post this)

http://docs.python.org/genindex-all.html


Wait, so how do I get my cloud of pie? What flavors do you do?



