
List:       openais
Subject:    [Openais] Announcing Corosync 1.3.1 available at ftp.corosync.org!
From:       Steven Dake <sdake () redhat ! com>
Date:       2011-04-25 3:43:13
Message-ID: 4DB4EDD1.4090700 () redhat ! com

I am pleased to announce a maintenance release of Corosync 1.3.1.
This maintenance release resolves many defects found through extensive
field deployments, the use of our automated testing toolset, and feature
development of SNMP and other features.

There are preliminary reports that this version works on hardware that
only supports aligned memory access, such as PPC and ARM processors.

The code is available via our FTP server at:
ftp://ftp.corosync.org


============================================================
The full list of changes
============================================================

2011-04-14  Tim Serong  <tserong@novell.com>

	Add ipc_refcnt to message_handler_req_{exec, lib}_cfg_ringreenable()
	Without refcounting the conn pointer here, corosync will segfault
	if one kills a running instance of "corosync-cfgtool -r" (rhbz#695191)

	Reviewed-by: Steven Dake <sdake@redhat.com>

	Fix typo in RRP faulty error messages
	Reviewed-by: Russell Bryant <russell@russellbryant.net>

2011-04-14  Steven Dake  <sdake@redhat.com>

	Align ipc on 8 byte boundaries
	Align all ipc messages on 8 byte boundaries.  This alignment will remove
	bus errors on systems that can't access unaligned data and should
	improve performance.

	Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
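
As an illustration of the 8 byte alignment described above, a message
length can be rounded up to the next multiple of 8 before the next message
is placed after it. This is only a sketch: the names IPC_ALIGN and
ipc_align_size() are hypothetical and are not taken from the Corosync source.

```c
#include <stddef.h>

/* Hypothetical constant: the alignment boundary in bytes (a power of two). */
#define IPC_ALIGN 8u

/* Hypothetical helper: round len up to the next multiple of IPC_ALIGN. */
static size_t ipc_align_size(size_t len)
{
	return (len + (IPC_ALIGN - 1)) & ~(size_t)(IPC_ALIGN - 1);
}
```

With this rounding, a 13 byte message occupies 16 bytes, so whatever follows
it starts on an 8 byte boundary again.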

	Fix problem where unaligned totemip address access would result in bus
	error on architectures that are not unaligned-safe.
	Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
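
One common way to avoid this class of bus error is to copy the bytes into a
properly aligned local variable instead of dereferencing a cast pointer. The
sketch below shows that general technique only. It is not claimed to be the
change made in this release, and read_u32_unaligned() is a hypothetical name.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: read a 32-bit value from a position that may not
 * be 4-byte aligned.  memcpy() makes no alignment assumption, whereas
 * *(const uint32_t *)p may trap on strict-alignment hardware.
 */
static uint32_t read_u32_unaligned(const unsigned char *p)
{
	uint32_t v;

	memcpy(&v, p, sizeof(v));
	return v;
}
```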

2011-04-14  Greg Walton  <corosync@gwalton.net>

	Clean up ENDIAN ifdef tests
	Reviewed-by: Steven Dake <sdake@redhat.com>

2011-04-12  Angus Salkeld  <asalkeld@redhat.com>

	IPC: place calls to stats functions outside of mutexes
	This is to prevent nasty deadlocks between IPC and objdb.

	Reviewed-by: Steven Dake <sdake@redhat.com>
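
The general shape of such a fix is to take a snapshot of the data while
holding the lock and to call into the other subsystem only after the lock is
released. The following is a minimal sketch of that pattern. All names in it
(conn_mutex, pending_sends, stats_update, conn_send_done) are hypothetical
stand-ins, not Corosync identifiers.

```c
#include <pthread.h>

/* Hypothetical connection state, protected by conn_mutex. */
static pthread_mutex_t conn_mutex = PTHREAD_MUTEX_INITIALIZER;
static unsigned int pending_sends;

/* Hypothetical stand-in for a stats call that may take its own locks. */
static unsigned int stats_last_seen;

static void stats_update(unsigned int value)
{
	stats_last_seen = value;
}

static void conn_send_done(void)
{
	unsigned int snapshot;

	pthread_mutex_lock(&conn_mutex);
	pending_sends++;
	snapshot = pending_sends;	/* copy while the lock is held */
	pthread_mutex_unlock(&conn_mutex);

	/* Called with no lock held, so it cannot invert the lock order. */
	stats_update(snapshot);
}
```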

2011-04-12  Zane Bitter  <zane.bitter@gmail.com>

	Provide better checking of the message type
	A negative value for the message type (on systems where char is signed)
	would cause a crash. This is highly probable if the cluster is, for example,
	misconfigured to have encryption enabled on some nodes but not others.

	Reviewed-by: Steven Dake <sdake@redhat.com>
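
To illustrate the signed char problem: a byte such as 0xF0 read through a
plain char becomes negative on platforms where char is signed, and a negative
value used as a table index reads out of bounds. A sketch of a defensive
check follows. MESSAGE_TYPE_COUNT and message_type_valid() are hypothetical
names and the value 6 is an assumption made for the example.

```c
#include <stddef.h>

/* Hypothetical: number of known message types (assumed value). */
#define MESSAGE_TYPE_COUNT 6

/* Hypothetical check: return 1 if the first byte is a known type. */
static int message_type_valid(const char *msg, size_t len)
{
	unsigned int type;

	if (len < 1) {
		return 0;
	}
	/* Go through unsigned char so bytes 0x80..0xff never turn negative. */
	type = (unsigned char)msg[0];
	return type < MESSAGE_TYPE_COUNT;
}
```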

2011-03-29  Angus Salkeld  <asalkeld@redhat.com>

	Fix shutdown when a confdb client is still connected
	If you are connected to corosync and registered for object
	notifications, and corosync is then asked to shut down, the IPC
	server will get stuck. This is because the pipe is closed and the
	refcount is increased. This leaves ipcs with a connection that it
	can't destroy.

	Solution:
	1) if a write to the pipe fails (pipe closed) decrement the refcounter.
	2) fix the object_track_stop() - it was not working as the functions
	   did not match up. (this caused the late callbacks).
	3) in ipcs call exit_fn() then stats_destroy_connection() so that
	   the service engine can have time to call object_track_stop()
	   before the object gets destroyed.

	Reviewed-by: Steven Dake <sdake@redhat.com>

2011-03-24  Angus Salkeld  <asalkeld@redhat.com>

	confdb: send notifications from the main thread not IPC thread
	corosync-notifyd has exposed an issue with confdb notifications.

	The normal state of affairs is:
	IPC thread > lock > objdb > lock

	objdb notifications, whilst really useful, turn things around:
	<middle of big call chain>
	objdb > lock > confdb > ipc > lock

	This reverse ordering of locks causes a horrible deadlock.

	I see this patch as a work around until corosync-2.0
	when most of the threads and locking disappear.

	This patch adds a pipe to the confdb service. When we get an
	objdb notification a struct gets written to the pipe.
	The poll loop then runs the dispatch in the main thread.
	In the dispatch we call the real ipc_dispatch_send().

	Reviewed-by: Steven Dake <sdake@redhat.com>
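
The pipe technique described in this entry can be sketched as follows: any
thread writes a small struct to the pipe, and the main thread's poll loop
reads it back and performs the dispatch. This is a simplified illustration
under assumptions. The struct layout and all function names are
hypothetical, and the counter update stands in for the real dispatch call.

```c
#include <poll.h>
#include <unistd.h>

/* Hypothetical notification record; small enough for an atomic pipe write. */
struct notify_event {
	int object_id;
	int change_type;
};

static int notify_pipe[2];
static int dispatched_count;
static int last_object_id;

static int notify_init(void)
{
	return pipe(notify_pipe);
}

/* May be called from any thread: only queues the event, takes no locks. */
static int notify_queue(const struct notify_event *ev)
{
	ssize_t n = write(notify_pipe[1], ev, sizeof(*ev));

	return n == (ssize_t)sizeof(*ev) ? 0 : -1;
}

/* Called from the main thread's poll loop; returns 1 if one event ran. */
static int notify_dispatch(int timeout_ms)
{
	struct pollfd pfd = { .fd = notify_pipe[0], .events = POLLIN };
	struct notify_event ev;

	if (poll(&pfd, 1, timeout_ms) <= 0 || !(pfd.revents & POLLIN)) {
		return 0;
	}
	if (read(notify_pipe[0], &ev, sizeof(ev)) != (ssize_t)sizeof(ev)) {
		return 0;
	}
	/* Stand-in for the real dispatch, which runs in the main thread. */
	last_object_id = ev.object_id;
	dispatched_count++;
	return 1;
}
```

Because the writer never calls into IPC directly, the
objdb > lock > confdb > ipc > lock chain is broken at the pipe.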

2011-03-24  Steven Dake  <sdake@redhat.com>

	totemsrp: free messages originated in recovery rather than rely on messages_free
	Relying on messages_free may seem like it should work, but it leads to a
	situation where every node has released the messages, yet some nodes think
	messages are missing.  The output then looks like "Retransmit: #" in
	repetition.  This patch frees those messages immediately during the
	transition to the OPERATIONAL state and sets the internal variables totemsrp
	depends upon to the proper values.

	Reviewed-by: Jan Friesse <jfriesse@redhat.com>

	totemsrp: Only restore old ring id information one time
	The current code stores the current ring information every time a commit
	token is generated.  This causes the old ring id used for comparison purposes
	to increase if a token is lost in commit or recovery, resulting in failure of
	totem.  This patch changes the behavior to only store the old ring id one
	time when the commit token is received, and then further commit token ring
	id saves are not done until OPERATIONAL is reached.

	Reviewed-by: Jan Friesse <jfriesse@redhat.com>

	totemsrp: Remove recv_flush code
	The recv_flush code is no longer necessary because of the miss_count_const
	addition.  It can in some cases lead to register corruption because of
	interactions with -fstack-protector, the recursive nature of how this code
	works, and interactions with the optimizer in some versions of gcc.

	Reviewed-by: Jan Friesse <jfriesse@redhat.com>

2011-03-21  Steven Dake  <sdake@redhat.com>

	Resolve abort during simultaneous stopping of at least 4 nodes
	Consider 5 nodes.

	Nodes 3 and 4 are stopped (by random stopping), nodes 1, 2 and 5 form a new
	configuration, and during recovery nodes 1 and 2 are stopped (via service
	corosync stop).  This causes node 5 never to finish recovery within the
	timeout period, triggering a token loss in recovery.  Bug #623176 resolved an
	assert which happens because the full ring id was being restored.  The
	resolution to Bug #623176 was to not restore the full ring id, and instead
	operate (according to specifications) on the new ring id.  Unfortunately this
	exposes a problem whereby the restarting of nodes 1-4 generates the same ring
	id.  This ring id gets to the recovery failed node 5 which is now in gather,
	and triggers a condition not accounted for in the original totem
	specification.

	It appears later work from Dr. Agarwal's Ph.D. dissertation considers this
	scenario.  That solution entails rejecting the regular token in the above
	condition.  Since the ring id is also used to make decisions for commit token
	acceptance, we must also take care to reject the regular token in all cases
	after transitioning from OPERATIONAL.

	Reviewed-by: Jan Friesse <jfriesse@redhat.com>

2011-03-07  Steven Dake  <sdake@redhat.com>

	Fix abort when token is lost in RECOVERY state
	A commit token should be rejected when a token is lost in the recovery
	state.  This occurs naturally because the ring id increases by 4 for
	every new ring.  Prior to this patch, if the token was lost, the old
	ring id information was restored, causing a commit token to be accepted
	when it should be rejected.  This erroneously accepted commit token would
	lead to an assertion which is fixed by this patch.

	Reviewed-by: Angus Salkeld <asalkeld@redhat.com>

2011-02-25  Steven Dake  <sdake@redhat.com>

	Don't assert when ring id file is less than 8 bytes
	If the ring id file for the processor is less than 8 bytes, totemsrp would
	assert.  Our speculation is that this condition happens during a fencing
	operation or local filesystem corruption.

	With this patch, Corosync will create fresh ring id file data when the
	incorrect number of bytes is read from the ring id file.

	Amend to use sizeof the strerror string length and PATH_MAX for the path
	length.

	Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
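
The recovery behaviour described in this entry, treating a short read as
"no usable data" instead of asserting, might look roughly like the sketch
below. It is an illustration only: ring_seq_load() is a hypothetical name,
and using 0 as the fresh value is an assumption made for the example, not
something stated in the changelog.

```c
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

/*
 * Hypothetical loader: read an 8 byte value from path.  If the file is
 * missing or shorter than 8 bytes, rewrite it with a fresh value (assumed
 * to be 0 here) instead of aborting.  Returns 0 on success, -1 on error.
 */
static int ring_seq_load(const char *path, uint64_t *ring_seq)
{
	uint64_t value = 0;
	ssize_t n;
	int fd;

	fd = open(path, O_RDONLY);
	if (fd >= 0) {
		n = read(fd, &value, sizeof(value));
		close(fd);
		if (n == (ssize_t)sizeof(value)) {
			*ring_seq = value;
			return 0;
		}
	}

	/* Missing or truncated file: create fresh data. */
	value = 0;
	fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
	if (fd < 0) {
		return -1;
	}
	n = write(fd, &value, sizeof(value));
	close(fd);
	if (n != (ssize_t)sizeof(value)) {
		return -1;
	}
	*ring_seq = value;
	return 0;
}
```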

2011-02-25  Jan Friesse  <jfriesse@redhat.com>

	objdb: destroy all handles in _clear_object
	Patch replaces free for object_instance with handle_destroy to remove
	leaks in handles (and also memory leak).

	Reviewed-by: Steven Dake <sdake@redhat.com>

	Iterate all items in object_reload_notification
	Reviewed-by: Steven Dake <sdake@redhat.com>

	corosync-fplay: use uint32_t and remove bit-shift
	The flight recorder records all data in 32 bit words. Use uint32_t type
	rather than unsigned int. Also remove bit-shift with multiply by sizeof
	uint32_t.

	Reviewed-by: Steven Dake <sdake@redhat.com>

	corosync-fplay: Use size_t length mod in printf
	Reviewed-by: Steven Dake <sdake@redhat.com>

	corosync-fplay: handle too large rec_size
	Corrupted files may contain items with rec_size larger than g_record
	buffer and/or flt_data_size.

	Also g_record array size is now defined as constant.

	Reviewed-by: Steven Dake <sdake@redhat.com>
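
A bounds check of the kind this entry describes could be sketched as below.
Only rec_size, g_record and flt_data_size are names from the changelog. The
constant G_RECORD_WORDS, its value, the function name and the assumption
that all sizes are counted in 32-bit words are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: capacity of the g_record buffer, in 32-bit words. */
#define G_RECORD_WORDS 1024

/*
 * Hypothetical check: return 1 if a record of rec_size words starting at
 * word offset fits in both the record buffer and the remaining data.
 */
static int rec_size_ok(uint32_t rec_size, size_t offset, size_t flt_data_size)
{
	if (rec_size > G_RECORD_WORDS) {
		return 0;
	}
	if (offset > flt_data_size) {
		return 0;
	}
	/* Subtract instead of adding so the comparison cannot overflow. */
	return rec_size <= flt_data_size - offset;
}
```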

	logsys: Properly lock flt data before dump
	Data needs to be locked, otherwise resulting fdata file may be
	incorrect.

	Reviewed-by: Steven Dake <sdake@redhat.com>

	logsys: Don't leak fd on successful fdata dump
	Reviewed-by: Steven Dake <sdake@redhat.com>

2011-02-25  Russell Bryant  <russell@russellbryant.net>

	Add calls to pthread_attr_destroy().
	This patch adds a couple of missing calls to pthread_attr_destroy().

	There were a couple of instances where pthread_attr_init() was being
	used without a corresponding call to pthread_attr_destroy().  This also
	localizes the pthread_attr_t to the function where it is needed instead
	of having it persist (the man page specifically states that destroying
	the attributes structure has no effect on threads created using the
	attributes).

	Reviewed-by: Steven Dake <sdake@redhat.com>
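
The pattern described above, a function-local pthread_attr_t that is
destroyed as soon as the thread has been created, can be sketched like this.
The functions worker() and start_worker() are hypothetical and exist only
for the example.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical thread body: sets the flag it is given. */
static void *worker(void *arg)
{
	int *flag = arg;

	*flag = 1;
	return NULL;
}

/* Hypothetical helper: returns 0 on success or a pthread error number. */
static int start_worker(pthread_t *thread, int *flag)
{
	pthread_attr_t attr;	/* local: only needed while creating */
	int res;

	res = pthread_attr_init(&attr);
	if (res != 0) {
		return res;
	}
	res = pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE);
	if (res == 0) {
		res = pthread_create(thread, &attr, worker, flag);
	}
	/* Destroying attr does not affect a thread already created with it. */
	pthread_attr_destroy(&attr);
	return res;
}
```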

2011-02-25  Angus Salkeld  <asalkeld@redhat.com>

	CONFDB: fix parent_get response id
	Reviewed-by: Steven Dake <sdake@redhat.com>

	STATS: fix key name length on "join_count"
	Reviewed-by: Steven Dake <sdake@redhat.com>

	STATS: increase the space for application names
	Reviewed-by: Steven Dake <sdake@redhat.com>

	remove unused function declaration
	Reviewed-by: Steven Dake <sdake@redhat.com>

	fix timersub warning on freebsd
	Make them all protected by #ifndef timersub

	Reviewed-by: Steven Dake <sdake@redhat.com>
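
For reference, an #ifndef guard of this kind typically looks like the sketch
below, so that the local fallback is only compiled when the platform headers
do not already provide timersub. The macro body shown is the conventional
BSD-style definition and is given as an assumption. It is not copied from
the Corosync tree.

```c
#include <sys/time.h>

/* Only define the fallback when the system headers did not provide one. */
#ifndef timersub
#define timersub(a, b, result)						\
	do {								\
		(result)->tv_sec = (a)->tv_sec - (b)->tv_sec;		\
		(result)->tv_usec = (a)->tv_usec - (b)->tv_usec;	\
		if ((result)->tv_usec < 0) {				\
			--(result)->tv_sec;				\
			(result)->tv_usec += 1000000;			\
		}							\
	} while (0)
#endif
```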

2011-02-25  Steven Dake  <sdake@redhat.com>

	Handle delayed multicast packets that occur with switches
	Some switches delay multicast packets vs the unicast token.  This patch works
	around that problem by providing a new tuneable called miss_count_const.  This
	tuneable works by counting the number of times a message is found missing
	and once reaching the const value, marks it as missing in the retransmit list.

	This improves performance and doesn't display warning messages about missed
	multicast messages when operating in these switching environments.

	Reviewed-by: Angus Salkeld <asalkeld@redhat.com>
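
The counting idea behind miss_count_const can be illustrated with a small
sketch: a message is only reported for retransmission after it has been seen
missing a configured number of times. The struct, the function name and the
value 5 are hypothetical choices for the example and do not reflect the
totemsrp data structures.

```c
/* Hypothetical threshold; in Corosync this is the miss_count_const tuneable. */
#define MISS_COUNT_CONST 5

/* Hypothetical per-message bookkeeping. */
struct miss_entry {
	unsigned int seq;		/* sequence number of the message */
	unsigned int miss_count;	/* times it was found missing */
};

/*
 * Hypothetical helper: record one more miss and return 1 once the message
 * should be added to the retransmit list, 0 while it may just be delayed.
 */
static int miss_note(struct miss_entry *entry)
{
	entry->miss_count++;
	return entry->miss_count >= MISS_COUNT_CONST;
}
```

A message delayed by the switch usually arrives before the counter reaches
the threshold, so no retransmit request is issued for it.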

2011-02-25  Angus Salkeld  <asalkeld@redhat.com>

	CPG: make sure coroipcc_service_disconnect() is always called.
	This prevents a shared mem leak if corosync dies while clients
	are connected.

	Calling cpg_finalize() did not release the shared mem as
	coroipcc_msg_send_reply_receive() returned an error and
	thus coroipcc_service_disconnect() did not get called.

	Reviewed-by: Steven Dake <sdake@redhat.com>

	IPC: send failure message to client if memory maps fail
	Reviewed-by: Steven Dake <sdake@redhat.com>