No speedup when using distributed memory cluster

Josh Thomas
Certified Consultant
AltaSim Technologies, LLC

May 23, 2012 1:54pm UTC

I wanted to see if someone could help me determine why I am not seeing any speedup when I submit jobs on multiple nodes of my HPC cluster (using the distributed memory capability).

I understand that speedup is highly model dependent. Per the suggestions of previous discussion threads, I have tried numerous different models with different physics (both linear and non-linear problems). Also, I have tried large memory models and small memory models (also per previous thread recommendations). Below are my results using the COMSOL Model Library Example "Micromixer Cluster Version":

Micromixer_cluster.mph (mesh as given):
1 node; 1 proc.: Run time = 123 sec
1 node; 12 proc.: Run time = 38 sec
4 nodes; 12 ppn: Run time = 99 sec
8 nodes; 12 ppn: Run time = 223 sec

I see speedup when I go from 1 proc. to 12 proc. on 1 node (shared memory). When I distribute the job across multiple nodes (distributed memory), however, I see no speedup; in fact, I see a slowdown relative to the single-node 12-proc. run. The step-by-step instructions said to try refining the mesh for better speedup (this was also COMSOL support's recommendation). Here are the results for two different refined meshes:

Micromixer_cluster.mph (refined mesh):
1 node; 1 proc.: Run time = 566 sec
1 node; 12 proc.: Run time = 130 sec
4 nodes; 12 ppn: Run time = 259 sec
8 nodes; 12 ppn: Run time = 501 sec

Micromixer_cluster.mph (super-refined mesh):
1 node; 1 proc.: Run time = 1169 sec
1 node; 12 proc.: Run time = 414 sec
4 nodes; 12 ppn: Run time = 614 sec
8 nodes; 12 ppn: Run time = 896 sec

Still no speedup. Only slowdown.
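To put numbers on the timings above, here is a minimal sketch that computes speedup and parallel efficiency relative to the 1-proc. run for each mesh. It assumes that "12 ppn" means 12 processes per node and that the total core count is nodes times ppn; the names `RUN_TIMES` and `speedup_table` are made up for this illustration and are not part of COMSOL.

```python
# Run times in seconds, copied from the tables in this post.
# Keys are (nodes, processes per node); reading "ppn" as processes
# per node is an assumption made for this sketch.
RUN_TIMES = {
    "mesh as given": {(1, 1): 123, (1, 12): 38, (4, 12): 99, (8, 12): 223},
    "refined mesh": {(1, 1): 566, (1, 12): 130, (4, 12): 259, (8, 12): 501},
    "super-refined mesh": {(1, 1): 1169, (1, 12): 414, (4, 12): 614, (8, 12): 896},
}


def speedup_table(times):
    """Return rows (nodes, ppn, cores, speedup, efficiency) vs. the 1-proc. run."""
    baseline = times[(1, 1)]
    rows = []
    for (nodes, ppn), seconds in sorted(times.items()):
        cores = nodes * ppn               # assumed total core count
        speedup = baseline / seconds      # > 1 means faster than 1 proc.
        efficiency = speedup / cores      # 1.0 would be ideal scaling
        rows.append((nodes, ppn, cores, speedup, efficiency))
    return rows


if __name__ == "__main__":
    for mesh, times in RUN_TIMES.items():
        print(mesh)
        for nodes, ppn, cores, speedup, efficiency in speedup_table(times):
            print(f"  {nodes} node(s) x {ppn} ppn = {cores:3d} cores: "
                  f"speedup {speedup:5.2f}, efficiency {efficiency:6.1%}")
```

Measured this way, every multi-node run is slower than the single-node 12-proc. run on the same mesh, and the 8-node run on the original mesh is slower than the 1-proc. run.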

Has anyone seen any speedup on this COMSOL Model Library example? If so, I'd be interested in your results.

One thing I am doing differently from the COMSOL recommendation is submitting jobs through the command line rather than through the Desktop. Does anyone know why COMSOL recommends submitting batch cluster jobs through the Desktop and not through the command line? Could this be my issue?

Any help would be appreciated.

STS
COMSOL Employee
France

October 4, 2012 4:50pm UTC in response to Josh Thomas

Re: No speedup when using distributed memory cluster

Dear Josh,

Unfortunately, I did not read your post earlier; I hope it's not too late!
The models that you have tested are provided to test proper operation of the cluster, not performance. For performance, you would do better to try to reproduce the results of the following paper, which is based on models also available in the Model Library, using the same number of DOFs: www.comsol.fr/papers/10248/

Best regards,
Stephan

--
www.comsol.fr
