
Re: [bgl-discuss] TCP socket problems on CNK



The colon hack only works on BGL CNKs.

Rob

Chad Glendenin wrote:
Doh! It turns out, it was a Makefile bogosity. I was accidentally using
blrts_xlc to build the server binary that was trying to run on login2, and
I should have been using gcc.

Is the "colon hack" something implemented specifically at ANL? I haven't
seen that before, and out of curiosity, I tried it on a random x86 Linux
box, where it didn't work.

Thanks,
ccg

On Wed, 29 Jun 2005, Rob Ross wrote:


Hey,

The CNK supports what IBM characterizes as "client" socket calls -- no
accept(), listen(), or select().

I was able to make connections from CNKs out to various nodes as well,
both login nodes and IO nodes.  I did have some early issues with using
gethostbyname(), so I found it easier to simply use dot notation IP
addresses.

You may have to tell SSH what IP to listen on for forwarding; it may be
listening on the wrong device (or maybe it listens on all by default; I
don't know off the top of my head).

You may need to specify the login node IP address in dot notation to
make sure that you're getting the right one.  You want the 172.30.1.xx
address.
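As a rough illustration of the client-only approach described above, here
is a minimal sketch that connects using a dotted-quad address and no name
lookup. The helper names make_addr and connect_dotted are made up for this
example, and it assumes the standard POSIX inet_pton() is available in the
CNK's C library, which this thread does not confirm. The address and port
in the comment are the ones mentioned elsewhere in the thread.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: fill in an IPv4 socket address from a dotted-quad
 * string, without calling gethostbyname(). Returns 0 on success, -1 if the
 * string is not a valid IPv4 address. */
static int make_addr(const char *dotted_ip, unsigned short port,
                     struct sockaddr_in *addr)
{
    memset(addr, 0, sizeof(*addr));
    addr->sin_family = AF_INET;
    addr->sin_port = htons(port);
    return inet_pton(AF_INET, dotted_ip, &addr->sin_addr) == 1 ? 0 : -1;
}

/* Hypothetical helper: connect to dotted_ip:port using only client-side
 * calls (socket and connect). Returns a connected descriptor, or -1. */
static int connect_dotted(const char *dotted_ip, unsigned short port)
{
    struct sockaddr_in addr;
    int fd;

    if (make_addr(dotted_ip, port, &addr) != 0) {
        fprintf(stderr, "bad address: %s\n", dotted_ip);
        return -1;
    }
    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) {
        perror("socket");
        return -1;
    }
    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        perror("connect");
        close(fd);
        return -1;
    }
    return fd;
}

/* Example use (address and port taken from this thread):
 *     int fd = connect_dotted("172.30.1.102", 65002);
 *     if (fd >= 0) { write(fd, "hello world\n", 12); close(fd); }
 */
```

Keeping the address parsing separate from the connect call makes it easy
to check that the address string is valid before blaming the network.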

There's also another option that we have dubbed "the colon open hack".
You can get a TCP connection like this:

open("tcp://172.30.1.101:22", O_RDWR);

That will return a TCP socket connected to 172.30.1.101:22.
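For a slightly fuller picture, the sketch below wraps that open() call in
a small function. The helper names tcp_path and open_tcp are invented for
this example. It only makes sense on a BGL CNK; on an ordinary system
open() treats the string as a file path and is expected to fail.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build a "tcp://ip:port" string for the colon open
 * hack. Returns 0 on success, -1 if the buffer is too small. */
static int tcp_path(char *buf, size_t len, const char *ip, unsigned port)
{
    int n = snprintf(buf, len, "tcp://%s:%u", ip, port);
    return (n > 0 && (size_t)n < len) ? 0 : -1;
}

/* Hypothetical helper: open a TCP connection via the colon open hack.
 * Assumes the BGL CNK behavior described in this thread. Returns the
 * descriptor from open(), or -1 on failure. */
static int open_tcp(const char *ip, unsigned port)
{
    char path[64];
    int fd;

    if (tcp_path(path, sizeof(path), ip, port) != 0)
        return -1;
    fd = open(path, O_RDWR);
    if (fd < 0)
        perror(path);
    return fd;
}

/* Example use (address and port taken from the message above):
 *     int fd = open_tcp("172.30.1.101", 22);
 */
```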

Keep trying; it's a pain, but it does work.

Rob

Kazutomo Yoshii wrote:

TCP connections from a CNK to the front-end work for me.
Our intern is also developing an interactive tool, and
he had no problems with TCP connections between a CNK and a login node.

Here is some sample code:
http://www-unix.mcs.anl.gov/~kazutomo/bgl/cnksocket.html




According to IBM's BlueGene development manual that Andrew Siegel posted
a link to recently, client TCP calls are supported in CNK. I've been
trying to get a CNK to connect to a socket service on another machine, but
I've had no luck. The goal is to interact with a running job on BGL.

I've been using a simple test client/server setup (the code's in
bgl:/home/chad/src/commtest/). Node 0 on BGL (the client) creates a socket
and tries to connect to my server process to send a "hello world" message.
The server process just listens on a particular TCP port and prints to
stdout whatever it hears. I've been using TCP port 65002.
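For readers without access to that directory, a server of the kind
described might look roughly like the sketch below. This is not the code
in commtest; the function names listen_on and serve_one are invented for
illustration. It is meant for an ordinary Linux host built with gcc, since
listen() and accept() are not available on the CNK.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create a TCP socket listening on all interfaces at
 * the given port (port 0 lets the kernel choose one). Returns the
 * listening descriptor, or -1 on error. */
static int listen_on(unsigned short port)
{
    struct sockaddr_in addr;
    int one = 1;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0) {
        perror("socket");
        return -1;
    }
    /* Allow quick restarts of the server on the same port. */
    setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one));

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(fd, 5) < 0) {
        perror("bind/listen");
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: accept one client and copy everything it sends to
 * stdout. Returns 0 on a clean end of stream, -1 on error. */
static int serve_one(int listen_fd)
{
    char buf[512];
    ssize_t n;
    int client = accept(listen_fd, NULL, NULL);

    if (client < 0) {
        perror("accept");
        return -1;
    }
    while ((n = read(client, buf, sizeof(buf))) > 0)
        fwrite(buf, 1, (size_t)n, stdout);
    fflush(stdout);
    close(client);
    return n < 0 ? -1 : 0;
}

/* Example use with the port from this message:
 *     int lfd = listen_on(65002);
 *     while (lfd >= 0 && serve_one(lfd) == 0)
 *         ;
 */
```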

Here's what I've tried:

-----

1. I tried connecting directly to my server running on terra.mcs.anl.gov,
but it didn't work. Since I have no idea what the network topology
involved looks like, I haven't spent a lot of time on this case. When I
try this, I get a "Connection timed out" error.

-----

2. I tried using SSH to forward port 65002 from login2.bgl to
terra.mcs.anl.gov:65002, but it didn't work. When the node tries to
connect to port 65002 on the login node, it gets "Connection refused."

I established the tunnel from login2 like this: 'ssh -Ax -L
65002:terra.mcs.anl.gov:65002 -N terra.mcs.anl.gov'

I tried the following from a shell on login2 after starting the tunnel:

2.a. login2 localhost: 'telnet 127.0.0.1 65002' works fine.
  The message shows up on the terra.mcs server.

2.b. login2 eth1: 'telnet 140.221.80.5 65002' fails. Connection refused.

2.c. login2 eth0: 'telnet 172.30.1.102 65002' fails. Connection refused.

So SSH port forwarding is apparently useless because connections to the
login node's externally-visible IP addresses are refused, even on a
non-privileged port. 127.0.0.1 is obviously not very useful.

-----

3. I can't try connecting directly to the server process running on a bgl
login node, because apparently I'm not allowed to create a socket:

chad@login2:~/src/commtest> ./daemon
wrappers.c: Socket(): socket(): Operation not permitted

-----

4. I wanted to try talking to an existing TCP service on a login node, but
echo, telnet, and http are all blocked.

-----

If anybody has any suggestions for something else I can try, please let me
know.

Thanks,
ccg

--------------------------------------------------------------------
To add or remove yourself from this mailing list, use the 'notifyme'
command on any BGL machine. To remove: notifyme -n, to add: notifyme -y.








