2008/01/30

Netgear GS724 - Cannot ping

In an industrial plant we've designed and delivered the L1 and L2 networks (here L1 and L2 refer to the ISA 95 levels, not to the ISO network stack layers!).

We used Netgear GS724 switches and monitored their service level with Paessler Ipcheck (http://www.paessler.com/ipcheck), which uses ICMP (ping) to poll them every minute.

Randomly, and very often, the switches appeared to crash, yet they still provided connectivity to the attached devices: you could ping the devices but not the switch itself.

After spending a lot of time trying and changing many configurations, we received a new firmware release (an alpha version) from Netgear, so don't waste time checking configurations!
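The symptom above (attached devices answer, the switch's own address does not) can be told apart from a real outage with a small check. This is only a sketch in Python, not how Ipcheck works internally; the function names and the state labels are made up for illustration, and the ping flags assume the stock Windows or Linux `ping` tools.

```python
import platform
import subprocess


def ping_command(host: str, timeout_s: int = 2) -> list[str]:
    """Build a one-shot ping command for the current OS."""
    if platform.system() == "Windows":
        # -n: number of echo requests, -w: timeout in milliseconds
        return ["ping", "-n", "1", "-w", str(timeout_s * 1000), host]
    # -c: number of echo requests, -W: timeout in seconds (Linux iputils ping)
    return ["ping", "-c", "1", "-W", str(timeout_s), host]


def is_reachable(host: str) -> bool:
    """Rough check: True when the ping tool exits with code 0.

    On Windows the exit code can be 0 even for a "destination unreachable"
    reply, so treat this as an approximation.
    """
    result = subprocess.run(ping_command(host), capture_output=True)
    return result.returncode == 0


def classify(switch_ok: bool, devices_ok: list[bool]) -> str:
    """Tell a silent management interface apart from a real outage."""
    if switch_ok:
        return "ok"
    if any(devices_ok):
        # Traffic still crosses the switch; only its own IP does not answer.
        return "management-unreachable"
    return "down"
```

In our case the result would have been "management-unreachable": the switch kept forwarding frames while its own IP stopped answering.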

2008/01/17

Spanning-Tree Tuning

These days I'm working with a colleague (a system engineer) in a steel-making plant, doing an assessment of the physical network.

I'm learning a lot of things, and the most important one is that network tuning is hard!

I had never considered that when you plan a network topology you must avoid cyclic paths (in graph-theory terms, the topology must contain no cycles).

Diagram of a medium-complexity network; note the absence of cyclic paths

The trade-off is that, to obtain highly reliable networks, you must provide redundancy over links or paths, so cyclic paths are required and cannot be avoided!

Spanning Tree is a protocol supported by bridges and switches to avoid cyclic paths in a redundant network; it works by automatically disabling the links that would cause a cyclic path.
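To make the idea concrete, here is a minimal Python sketch of the outcome of Spanning Tree: a set of links split into active ones (loop-free) and blocked ones. It is not the real protocol, which elects a root bridge and exchanges BPDUs; it only uses a union-find structure to detect which link would close a loop. The switch names are hypothetical.

```python
def block_redundant_links(links):
    """Split links into (active, blocked) so the active ones contain no cycle."""
    parent = {}

    def find(node):
        # Return the representative of the group the node belongs to.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    active, blocked = [], []
    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            # Both ends are already connected: this link would close a loop.
            blocked.append((a, b))
        else:
            parent[root_a] = root_b
            active.append((a, b))
    return active, blocked


# A four-switch ring: one link has to be disabled to break the cycle.
ring = [("sw1", "sw2"), ("sw2", "sw3"), ("sw3", "sw4"), ("sw4", "sw1")]
active, blocked = block_redundant_links(ring)
print(blocked)  # [('sw4', 'sw1')]
```

The blocked link is not removed: if an active link fails, the protocol can re-enable it, which is exactly the topology change discussed in this post.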

In our case we have a double-ring fiber-optic topology built with Netgear 7212 (28 switches installed) and GSM 724 (13 switches installed).

  • The 7212s are the nodes on the fiber-optic network
  • The 724s are the switches that connect the PCs and servers, and they are always connected to the 7212s (which make up the backbone)

We turned on Spanning Tree (abbreviated to SPT) on every 7212, but quite often we get a Network Topology Change, which means that SPT has detected a broken link and is changing the preferred paths on all the other 7212s.

This means that the network slows down for a couple of minutes (with the 7212s).

Looking for the causes, we discovered:

  • An unreliable fiber-optic link that the 7212 sometimes sees as broken, causing a Topology Change.
  • A 7212 (not on the ring) that was flapping (every 2 minutes), causing further network topology changes.

Lessons Learned

  • Turn off SPT on switches outside the ring
    You cannot have cyclic paths on those connections, but if for some reason one of those links breaks, SPT will force a network topology change on every other switch on the ring!
     SPT
    The previous diagram shows our case: on the red switches (outside the ring) SPT must be switched off.
  • In SPT, define the cost for each link
    • Spend time finding the least reliable links, because when they are in use every communication error on them causes a topology change.
    • Give the highest cost to these links
  • Filter the SPT traffic when different networks are connected
    • In production plants there are usually different networks (installed at different times by different vendors) that must be connected (supervision, chemical analysis, back office, etc.). When you connect these networks, take care to filter the traffic and, if feasible, use an L3 switch.
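The link-cost advice above can be illustrated with a small Python sketch. It is a simplification, not the real protocol: it picks a root switch by hand and keeps, for each switch, the cheapest path to the root (Dijkstra's algorithm); every link not on one of those paths ends up blocked. The switch names and cost values are hypothetical.

```python
import heapq


def spanning_tree(links, root):
    """links maps (a, b) to a cost. Return the set of links kept active."""
    neighbours = {}
    for (a, b), cost in links.items():
        neighbours.setdefault(a, []).append((b, cost, (a, b)))
        neighbours.setdefault(b, []).append((a, cost, (a, b)))

    best = {root: 0}            # cheapest known cost to reach each switch
    done = set()                # switches whose path to the root is final
    active = set()
    queue = [(0, root, None)]   # (path cost, switch, link used to reach it)
    while queue:
        cost, node, via = heapq.heappop(queue)
        if node in done:
            continue
        done.add(node)
        if via is not None:
            active.add(via)
        for other, link_cost, link in neighbours.get(node, []):
            new_cost = cost + link_cost
            if other not in done and new_cost < best.get(other, float("inf")):
                best[other] = new_cost
                heapq.heappush(queue, (new_cost, other, link))
    return active


flaky = ("sw1", "sw2")
links = {
    flaky: 100,            # unreliable link: highest cost
    ("sw2", "sw3"): 4,
    ("sw3", "sw4"): 4,
    ("sw4", "sw1"): 4,
}
active_links = spanning_tree(links, root="sw1")
blocked_links = set(links) - active_links
print(blocked_links)  # {('sw1', 'sw2')}
```

With the high cost, the flaky link is the one left out of the active topology, so its errors no longer trigger a topology change on the ring. With equal costs on all four links the same sketch would keep it active.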

2008/01/15

Global Policy - Annotations from Belgium

  • With Vista, policies are not pushed: you need to run gpupdate.exe or reboot to apply them
  • If you change a user policy, a logoff and login does not apply it. You must reboot or run gpupdate
  • All about Terminal Server auto logon: http://blogs.msdn.com/ts/archive/2007/01/22/vista-remote-desktop-connection-authentication-faq.aspx
    • You need to "call" each client at least once, using mstsc to connect to the server and providing the password.
    • I don't know how to propagate the credentials for a single sign-on. It is probably feasible with Windows Server 2008
  • To enable ping on Vista (by global policy): 

 ScreenShot001

Global Policy - Autologin Admin Template

For the Level 2 SCADA workstations I need the clients to log in automatically. To control the autologin from a Windows Server global policy:

http://www2.truman.edu/~whowd/blog/2005/10/group-policy-auto-logon-administrative.html

Set AutoAdminLogon and ForceAutoLogon to "1".
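For reference, the two values can be written out as a .reg file. The Python sketch below only builds the file text; it assumes that both values are string values under the standard Winlogon registry key, which is an assumption about what the template writes and not something checked against it. The function name is made up for illustration.

```python
# Assumption: the values live under the standard Winlogon key as strings.
WINLOGON_KEY = r"HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Winlogon"


def autologon_reg(values: dict[str, str]) -> str:
    """Build the text of a .reg file that sets string values under Winlogon."""
    lines = ["Windows Registry Editor Version 5.00", "", f"[{WINLOGON_KEY}]"]
    for name, data in values.items():
        lines.append(f'"{name}"="{data}"')
    # .reg files conventionally use Windows line endings.
    return "\r\n".join(lines) + "\r\n"


reg_text = autologon_reg({"AutoAdminLogon": "1", "ForceAutoLogon": "1"})
print(reg_text)
```

The user name and password the autologin needs are deliberately left out of the sketch.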

2008/01/09

Astoria

Yesterday I left the customer plant (in Belgium) at 11:00 PM and spent the time before going to sleep watching a very interesting webcast by Pablo Castro on Astoria (2007 SAF Forum: http://msdn2.microsoft.com/en-us/architecture/bb267380.aspx).

The technology is very attractive, but in the real world I see a range of applications only when you don't care about performance and locking issues, or when the data sources are mainly read-only.

For example, it's very powerful in a scenario where you have an element manager (industrial plants, TLC network devices, etc.) on an RDBMS and you need to combine this reference information with other data sources (e.g. WMI, SNMP, statistics, etc.). Astoria is a strong tool to publish and integrate heterogeneous data, making it easy to consume from supporting tools or SLA dashboards and decoupling the consumer from the business service. It's strong in scenarios where you need to continuously update quick and smart tools as business requirements change (OBA ???).

HP Power Management

I'm not a system engineer, so I hate doing infrastructure configuration... this post is just a note to myself.

We're delivering a project in an industrial plant in Belgium (L1 & L2 integration, L2 Tracking, Reporting, BAM) with a small server farm (7 HP DL and a 2-node cluster).

The servers are connected to an HP R5500 XR UPS.
  • The UPS guarantees 22 minutes of uptime (a very short time, I think, good only for flickering lines)
  • The UPS must be connected to a Management Server (where you have to install the HP Power Manager suite).
  • With HP Power Manager the rule is 1 UPS, 1 Management Server

To set up HP Power Manager in the farm, the steps are:

  • Connect the UPS to the chosen management server and install the Power Manager suite (I used version 4.2).
    During setup you must provide information on:
    • how the UPS is connected (serial, USB, network)
    • the IP/port on which to publish the management site (the prerequisite is having IIS already deployed)
  • Install the HP Remote Agent on the servers you want to keep connected to the UPS.
    The HP Remote Agent is in charge of gracefully shutting down the servers when the management server receives a low-battery notification from the UPS (and forwards this message to all the connected agents).
    I used HP Remote Agent V 4.0, which is the latest version, and it works with Power Manager 4.2.
  • Enter the new Power Management administration site on the management server (the default account/pwd should be admin/admin).
  • Test the battery to tune the internal management software data ("Battery Test" in the left menu of the administration site)
  • List the Attached Devices, which define the scope of management. Each specified device can be:
    • Management Server: the server physically connected to the UPS; there can be only 1 in the topology
    • Remote Agent: the servers connected to the Management Server through Remote Agents
    • Other: other devices on the switch which cannot be monitored or gracefully shut down (e.g. switches)
  • It's very important to give each Attached Device the Shut Down OS parameter, which the software uses to work out the shutdown time based on the estimated battery charge.
  • The last step is to assign each Attached Device to a Load Segment:
    • In my deployment I've got 2 Load Segments. A Load Segment defines an atomic set of servers that Power Manager manages as a single unit because they are physically connected to a single electric line. So if you perform a shutdown on a Load Segment, all the servers on that segment are shut down.
    • In my topology:
      • Load Segment 1 holds only the management server, which should be the last server to be switched off, and the database.
      • Load Segment 2 holds the remaining servers
    • Pay attention when you assign an Attached Device to a Load Segment: check that the server's power cable is plugged into the right UPS outlet.
  • You can control the shutdown policy from the Power Fail menu option (on the left). I chose:

    • For Load Segment 1: Run Until Battery Depletion, which means that HP Power Manager evaluates the remaining battery charge and shuts down the management server accordingly.
    • For Load Segment 2: Maximize Runtime, which means that these servers are shut down together with the management server. It acts like a dependency between Load Segment 1 and Load Segment 2.
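To show why the Shut Down OS parameter matters, here is a rough Python sketch of the arithmetic involved. It is not how HP Power Manager computes its schedule; it only assumes that a segment must start shutting down early enough for its slowest server to finish before the battery runs out, minus a safety margin. The device names, shutdown times and the margin are hypothetical; the 22 minutes of runtime come from the UPS figure above.

```python
from dataclasses import dataclass


@dataclass
class AttachedDevice:
    name: str
    load_segment: int
    shutdown_os_s: int  # seconds the OS needs to shut down cleanly


def latest_shutdown_start(devices, runtime_s, margin_s=60):
    """Per load segment, seconds after power loss by which shutdown must begin."""
    slowest = {}
    for dev in devices:
        current = slowest.get(dev.load_segment, 0)
        slowest[dev.load_segment] = max(current, dev.shutdown_os_s)
    # The slowest server of a segment decides when the whole segment must start.
    return {seg: max(0, runtime_s - margin_s - t) for seg, t in slowest.items()}


devices = [
    AttachedDevice("management-server", load_segment=1, shutdown_os_s=180),
    AttachedDevice("app-server-1", load_segment=2, shutdown_os_s=240),
    AttachedDevice("app-server-2", load_segment=2, shutdown_os_s=120),
]
deadlines = latest_shutdown_start(devices, runtime_s=22 * 60)
print(deadlines)  # {1: 1080, 2: 1020}
```

If a server's shutdown time is underestimated, its segment starts too late and the server may lose power before the OS has stopped.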