Bugs Fixed in vFabric GemFire 7.0.0
Last updated: 10/25/2012
| Id | Created | Title | Description | Workaround for earlier GemFire versions |
|---|---|---|---|---|
| #42058 | 06/14/10 | Input/output error when creating a disk store on NFS mount | When persisting to an NFS mount on Red Hat 5, creating the persistent store occasionally fails with this error: java.io.IOException: Input/output error. | |
| #42264 | 08/11/10 | Connections continue to be closed even when socket-lease-time="0" | Connections continue to be closed even when socket-lease-time="0". | |
| #42343 | 09/17/10 | The PartitionedRegionStats instance name is too long | The PartitionedRegionStats instance name is "Partitioned Region " + fullRegionName + " Statistics". It should just be fullRegionName. | no workaround |
| #42382 | 10/01/10 | Index support on overflow regions | Applications can create indexes on overflow regions, provided that the index expressions satisfy the compact-range index requirements. | |
| #42388 | 10/04/10 | Nested Query | A TypeMismatchException is thrown if an alias is used in a nested query and the results returned from the nested query are used in the WHERE clause of the top-level enclosing query. | Do not use aliases in a nested query if the results returned from the nested query are used in the WHERE clause of the top-level enclosing query. |
| #42415 | 10/18/10 | Querying with indexes | This bug has been fixed in 7.0. | |
| #42429 | 10/24/10 | In rare cases a member crashing during initial bucket creation can result in a hang | In rare cases, if the member that is the primary for a bucket being created crashes during bucket creation, the other members may fail to choose a new primary, resulting in hangs when trying to update that bucket. | |
| #42433 | 10/25/10 | A getAll done from a client will not update the last access time of the entries it reads on the server | A getAll done from a client will not update the last access time of the entries it reads on the server. | no workaround |
| #42457 | 11/11/10 | Concurrent update and query on a Region entry | After the internal locking scheme for indexes was changed in version 6.6.2, GemFire could return wrong query results if an index was used in result evaluation while a concurrent update was happening on the region containing the index. The region entry in question could have its field changed by the update while the index used for the query is created on that same field. This has been fixed in version 7.0 without any effect on query or update performance. | |
| #42510 | 12/09/10 | GatewayHub with 'primary' StartupPolicy can start as secondary if other GatewayHub is started with 'none' StartupPolicy | A GatewayHub configured with a "primary" StartupPolicy can start as secondary if another GatewayHub is started with a "none" StartupPolicy. If the GatewayHub configured with "none" comes up first, it becomes primary and does not relinquish primary status to the GatewayHub configured as "primary". | Start the "primary" GatewayHub first, so that it can obtain the primary lock. |
| #42790 | 02/17/11 | isOriginRemote inconsistent for transactional loads | For a GemFire transaction hosted remotely on behalf of a peer member, the value of the isOriginRemote flag is inconsistent when a region.get results in a loader invocation. The value of isOriginRemote reported by the various callbacks is as follows: cacheWriter: false; cacheListener: true; TransactionWriter: true; TransactionListener: true. | |
| #43153 | 04/14/11 | Admin API or JMX Agent may fail if any GemFire member cannot locate its gemfire.jar | GemFire attempts to find its gemfire.jar in order to respond to monitoring attempts by the Admin API and JMX Agent. The following three locations are searched: 1) getProtectionDomain().getCodeSource().getLocation(); 2) "java.class.path", searched for gemfire.jar; 3) "sun.boot.class.path", searched for gemfire.jar. If a JVM or container environment does not return a URL in #1 that can be used to open a stream, then the Admin API or JMX Agent may hang or fail if gemfire.jar cannot be found on either "java.class.path" or "sun.boot.class.path". | Place gemfire.jar on "java.class.path" or "sun.boot.class.path". |
| #43351 | 05/10/11 | Some GemFire APIs use Serializable | Some of the GemFire APIs require that an instance of Serializable be passed or returned. If you wish to use an object that implements PdxSerializable with these APIs, then you will also need to implement Serializable. Any classes serialized by a DataSerializer or a PdxSerializer will also need to implement Serializable to work with these APIs. GemFire serialization takes precedence over Serializable, so you still get the benefit of it. The APIs that use Serializable are: ResultCollector, ResultSender, Execution.withArgs, and PartitionResolver.getRoutingObject. | no workaround |
| #43466 | 05/25/11 | Transaction commit may hang | In a very busy system, transaction commit may starve when multiple threads are trying to begin transactions on regions with expiration. This has been fixed. | |
| #43614 | 06/24/11 | Query results with null values throw NPE while querying between partitioned region nodes and between client and server | A NullPointerException (NPE) was seen when query results containing null values were sent between partitioned region nodes or between client and server. Null values are now sent and received without any issues. | |
| #43621 | 06/27/11 | Index initialization | During index initialization (either after a node restart or when an index is created on a region that is already populated with entries), index entries could contain an UNDEFINED value while the index key was not UNDEFINED. A valid (non-UNDEFINED) key should never point to an UNDEFINED value. Fixed in the 7.0 release. | |
| #43705 | 07/19/11 | The CPU usage for the JMX agent can be high | In certain situations, the JMX Agent uses too much CPU for a sustained period of time when GFMon is connected to it. | Set the JMX Agent's refreshInterval to 60 seconds and GFMon's Refresh Interval to 60000 ms. This causes periods of JMX Agent inactivity followed by a period of CPU usage rather than constant high CPU usage. |
| #43715 | 07/20/11 | NOT operator with LIKE predicate in an OQL query | The NOT operator is ineffective when used with a LIKE predicate, as in the following query: Select * from /exampleRegion r where NOT (r.firstname LIKE "a%") | |
| #43812 | 08/15/11 | A member that is starting up might hang during ShutdownAll | A member that is starting up and waiting for another member with newer data will wait forever. However, the member with newer data might have been closed by the same ShutdownAll request. In that case, the starting member should not block waiting. | |
| #43828 | 08/18/11 | Query support for COUNT(*) aggregate function | COUNT queries (for example, SELECT COUNT(*) from /regionName <where clause>) are now supported by the GemFire Query Engine. | In older versions of GemFire, you can use (SELECT * FROM /region).size with replicated regions. |
| #43829 | 08/18/11 | Breaks backward compatibility to 6.5 | If GemFire users define a string such as "${user-string}" in cache.xml, it is replaced with "user-string" before being used in the GemFire cache. | |
| #43835 | 08/19/11 | Arrays of a domain class that is PDX serialized may cause exceptions when read-serialized is true or doing a query | When doing a query or if read-serialized is true, then an array of domain classes that is serialized as a PDX may cause exceptions. The exceptions can be ClassNotFoundException or a failure to assign an element to the array. PdxInstanceImpl.hashCode or PdxInstanceImpl.equals will be in the call stack. | Change the array type to Object[] in your code. |
| #43838 | 08/19/11 | PdxInstance.getField will always deserialize all nested arrays as Object[] | PdxInstance.getField will deserialize all arrays in that field's value as an Object[]. The only exception to this is an array nested in an object that used standard java.io.serialization. This can cause a ClassCastException. For example, if the field had a nested DataSerializable that also had a nested array, then the DataSerializable would try to cast the result of readObjectArray to its array type. This attempt would fail because the array was deserialized as an Object[]. | no workaround |
| #43847 | 08/23/11 | Queued CqEvents are delivered using the application thread | The CqEvents that are queued during executeWithInitialResults (the CQ events that originate while the result set is being sent to the client) are delivered to the client's CqListeners before the ResultSet is returned, and they are delivered in the application thread rather than in a separate thread. This is addressed by making sure the CqEvents are processed through the callback thread rather than the application thread. | |
| #43860 | 08/25/11 | Network LinuxSystemStats values are wrong | The network-related LinuxSystemStats have values that are wrong. The very first sample recorded for these is correct, but each subsequent sample increases too much. As a result, the delta between samples is constantly in the gigabyte range. The stats that have this bug are: recvBytes, recvPackets, recvErrors, recvDrops, xmitBytes, xmitPackets, xmitErrors, xmitDrops, xmitCollisions. | no workaround |
| #43879 | 09/02/11 | Cache XML encoding | All XML encodings, including UTF-8 (or equivalent), are supported and parseable. | |
| #43884 | 09/06/11 | NullPointerException starting locator with the "-properties=" option | Attempting to start a locator with the "gemfire" command and specifying a "-properties=" option will result in a NullPointerException, and the locator will not be started. This is caused by an error in parsing and processing the command line. You can work around the problem by setting a system property on the command line, such as: gemfire start-locator -port=15963 -properties=/export/dune1/gemfire/gemfire.properties -Dabcd=abcd. The system property can be anything you like. | |
| #43888 | 09/06/11 | Suspect verification sometimes waits 2 x member-timeout | The final step of kicking a failed node out of the system sends an ARE_YOU_DEAD message to that node in a UDP unicast packet. It is then supposed to give member-timeout milliseconds for the node to respond. If the node doesn't respond, it is kicked out of the system. A mistake in the time-interval calculations for this process makes this final step sometimes wait nearly twice the member-timeout period before declaring the node dead. | |
| #43898 | 09/12/11 | Query with OR clause and indexes | Results for an OR condition in an OQL query are now correct when indexes are available for both conditions of the OR clause. | |
| #43900 | 09/12/11 | Instantiators may not be recovered from disk if the class is not available | Instantiators and DataSerializers are persisted as part of a disk store. If the Instantiator or DataSerializer class is not available to the classloader when the disk store is created (during cache creation, usually), the Instantiator or DataSerializer will not be recovered. Later, this could result in a deserialization error if the Instantiator has not been loaded by other means. | Always explicitly register instantiators and data serializers in all members, for example by using a serialization-registration in the cache.xml. |
| #43910 | 09/15/11 | 6.5 disk store files may fail with NullPointerException during a 6.6 offline compaction | If you use GemFire 6.6 to do an offline compaction of a disk store that was created with GemFire 6.5, it may fail with a NullPointerException. | no workaround |
| #43936 | 09/22/11 | Some cache operations may fail during PDX auto serialization due to a CacheClosedException | Some cache operations may fail during PDX auto serialization because of a CacheClosedException. The CacheClosedException is ignored in certain cases, but a bug in the auto serializer caused it not to be ignored. | no workaround |
| #43959 | 09/28/11 | cacheserver start throws a NullPointerException when an invalid .cacheserver.ser file is left behind from a previous Cache Server process | The cacheserver command shell script invokes the com.gemstone.gemfire.internal.cache.CacheServerLauncher Java class to start a GemFire Cache Distributed Member as a server. The .cacheserver.ser file can be left behind, for instance after a Cache Server crashes, and it can also become corrupted. Either way, the CacheServerLauncher class handles a leftover file incorrectly, assuming that it can always be read correctly. When the file cannot be read, the readStatus and spinReadStatus methods return a null reference, causing the check on the Status object's 'state' to throw a NullPointerException. In this situation, the check should just assume that the Cache Server process is no longer running and delete the existing .cacheserver.ser file. This is the behavior implemented by the fix. | Manually delete the .cacheserver.ser file in the working directory of the Cache Server process and invoke cacheserver start again. |
| #43978 | 10/05/11 | GatewayEventListener sees events on PdxTypes region | If portable data serialization (PDX) is used in a system with a GatewayEventListener installed, the GatewayEventListener receives events on a region called PdxTypes. This is an internal metadata region that should not be exposed to the user callbacks. | Ignore events on this PdxTypes region in the GatewayEventListener. |
| #43979 | 10/05/11 | GatewayEvent.getDeserializedValue returns serialized bytes | If there is an error deserializing the value when calling GatewayEvent.getDeserializedValue, the serialized bytes will be returned instead of an error being thrown. | Fix the underlying deserialization error. |
| #43987 | 10/07/11 | FailedSynchronizationException when cache is started and closed rapidly | When a cache is created and shut down in rapid succession (e.g. in unit testing), one may get a FailedSynchronizationException due to a race in transaction manager initialization. | |
| #43993 | 10/10/11 | Eviction may fail with IllegalArgumentException | Eviction may fail with an IllegalArgumentException that says 'Must not serialize NOT AVAILABLE in this context'. | no workaround |
| #44158 | 11/11/11 | Redundancy zones for HA are random in presence of Teredo link-local address fe80:0:0:0:0:100:7f:fffe on Windows | The presence of non-unique, link-local IPv6 addresses may cause GemFire to think that two different hosts are actually the same machine. One such case is Windows machines that use Teredo, which may use the link-local address fe80:0:0:0:0:100:7f:fffe on every machine. If GemFire decides that two hosts are actually the same machine, it may refuse to create redundant copies of buckets on those hosts if enforce-unique-hosts is set to true. With enforce-unique-hosts set to false, GemFire may be unable to distinguish between two JVMs that actually are on the same host and two JVMs that are on different hosts, and therefore may end up placing two redundant copies of a bucket on the same host. This is fixed in GemFire 7.0 and 6.6.3.5. | Upgrade to GemFire 7.0 or 6.6.3.5. The alternative is to disable the Teredo service on every Windows machine hosting GemFire servers using this command: "netsh interface teredo set state disabled" |
| #44418 | 01/23/12 | CustomExpiry does not allow expiration on an existing entry to be made earlier | When using CustomExpiry instances, your implementation may return an initial ExpirationAttributes that sets the expiration for an entry at some point far in the future. Then, on an update or access of that entry, your CustomExpiry would return an expiration at an earlier point in time. However, because the entry has already been scheduled, the second call to CustomExpiry does not happen, and your entry's expiration remains at the time it was originally scheduled for. | |
| #44436 | 01/25/12 | Implicit method invocation is not supported on PDX objects | When the query engine needs to fetch a value from an object, it first looks for the public attribute by that name (symbol). If the name is not found, it creates an implicit getter method "getSymbol()" and tries to see if the method is available. If available, it fetches the value by invoking the method call. The implicit method invocation is not supported on PDX types. This is fixed in 7.0. | Upgrade to 7.0. For prior versions, in order to invoke an object's method during querying, you must call the method explicitly within the query string. |
| #44625 | 03/05/12 | DiskAccessException in one peer causes a failure in another peer | When updating a persistent partitioned region, if one member receives a DiskAccessException when writing to disk (for example, if the disk is full), there is a chance that one of the peers that did not receive the DiskAccessException may also close the region and bridge server. | Ensure that sufficient space is available for GemFire persistent files. If this situation is encountered, fix the space issues and restart the failed members. |
| #44649 | 03/08/12 | Large transactions initiated by client can cause OOME | The GemFire server keeps the last 1000 client-initiated transactions in memory to deal with HA scenarios. If client-initiated transactions contain a large number of operations, then the server may run out of memory and throw an out-of-memory exception (OOME). | |
| #44716 | 03/21/12 | Calling executeWithInitialResults concurrently multiple times may cause NullPointerException | If multiple concurrent calls to executeWithInitialResults are made, there is a potential timing issue where a NullPointerException will result, along with missing data. | Avoid making multiple concurrent executeWithInitialResults calls; doing so is not recommended practice. |
| #44727 | 03/22/12 | Deadlock closing the cache while recovering a persistent partitioned region | In rare cases, calling cache.close, or using shutdown all, can cause members to hang during shutdown if the members were in the process of recovering a persistent partitioned region from disk. | |
| #44818 | 04/03/12 | Colocated PRs Querying | Colocation between two partitioned regions (PRs) is defined one way: if PR1 is defined as colocated with PR2, then PR2.isColocated(PR1) is not true; only the reverse is true. A query on colocated regions should check both directions, because it is valid for queries to use PRs colocated in either direction. | |
| #44844 | 04/09/12 | Distributed deadlock between index creation and disk store recovery | This happens when a partitioned region with an index starts recovering its data from a persistent file that may not be the most recent one. While "Node-1" is recovering its partitioned data set (buckets), if it sees that another node, "Node-2", in the distributed system has the latest data, it starts waiting for "Node-2" to host that bucket set. Meanwhile, "Node-2" could have sent an index creation message to "Node-1" and be waiting for a response. The result is a deadlock, with "Node-1" waiting for "Node-2" to host the data set and "Node-2" waiting for "Node-1" to create the index. | |
| #44856 | 04/11/12 | ConcurrentMap methods throw UnsupportedOperationException on LOCAL regions | The ConcurrentMap methods throw an UnsupportedOperationException on regions with LOCAL scope. | If the region is created in a member that does not have any peers, give it a distributed scope such as DISTRIBUTED_NO_ACK. Since it has no peer to distribute to, this allows you to use ConcurrentMap methods without any actual distribution happening. |
| #44857 | 04/11/12 | Member is kicked out later than it should be, apparently 2x the member-timeout interval | In a situation where two server caches were having trouble, each requested that suspect verification be performed on the other. The membership coordinator initiated processing for both at about the same time, but verification for the first completed in 5002 milliseconds while the second took 10003. Both should have completed in about 5000 milliseconds, the member-timeout interval for the distributed system. This can cause blocked operations to take longer to recover when there is a crash. | no workaround |
| #44873 | 04/13/12 | Stale NFS file handle exception was found while using NFS for disk stores | When using NFS or vMotion for a disk store, you might hit a Stale NFS file handle exception. This is an NFS issue. The root cause is a FileNotFoundException: host A tried to recover from disk files located on host B via NFS, and calling FileOutputStream() threw a FileNotFoundException. The Javadoc says: "If the file exists but is a directory rather than a regular file, does not exist but cannot be created, or cannot be opened for any other reason then a FileNotFoundException is thrown." | |
| #44897 | 04/16/12 | Durable CQ events are lost when regions and pools are defined by cache.xml | Continuous queries on durable clients cannot be used when pools and regions are defined by using cache.xml. When cache.xml is used to define regions and pools, there is an implicit call to readyForEvents() that begins the processing and sending of events after cache.xml processing is complete. However, any continuous queries or cache listeners that are not defined in cache.xml will not get the event updates. | If you wish to use continuous queries on durable clients, you should define regions and pools using API calls. Do not use cache.xml. |
| #44914 | 04/17/12 | PDX must use the DEFAULT disk store if keys are PDX-enabled and persistent | If PDX is configured with a disk store name, then startup may fail with a PdxInitializationException. | Do not specify a disk-store name. Instead, just set persistent to true and use the default disk store. If you want to configure the default disk store, you can do so by using the name "DEFAULT". |
| #44924 | 04/18/12 | JMX Agent didn't read configuration in gfsecurity.properties | JMX Agent now reads the SSL configuration from gfsecurity.properties. | Specify the SSL/security configuration as system properties while starting the JMX Agent. |
| #44925 | 04/18/12 | Occasional deadlock between agent and cache servers | A deadlock was observed occasionally between agent and cache servers when starting many servers simultaneously. This issue has been addressed. | |
| #44927 | 04/18/12 | Gfsh consumed one client license per server region | In prior versions, gfsh consumed one client license per server region. It created a pool (which corresponded to a client license on the server) per region. This is a problem for customers with limited client licenses. | |
| #44930 | 04/18/12 | Hang during network failure with processes attempting to contact lost members | When there is a network failure, the membership coordinator attempts to contact members that do not respond to membership view changes. These attempts have a member-timeout limit set, but this timeout is not being respected by the Java TCP/IP sockets. This causes the system to delay sending out membership changes and causes operations that should be distributed to the lost members to hang until the coordinator gives up trying to contact the lost members. | |
| #44939 | 04/19/12 | Wrong exception thrown if transaction host departs | A TransactionDataNodeHasDepartedException was wrongly thrown rather than a TransactionInDoubtException during one HA scenario. | |
| #44941 | 04/19/12 | Creation of invalid entry causes NullPointerException in remote WAN site | It is possible to use create(K,null) or a similar operation with a null value to create an "invalid" entry in GemFire. If this entry is transmitted over a WAN gateway, it can cause problems on the receiving side if a ConcurrentMap operation is later performed on the same entry. This can cause the latter operation to hang. | |
| #44967 | 04/24/12 | Deadlock between membership view management and surprise member handling | It is possible for the product to hang when processing a new membership view if a concurrent attempt is made to connect to the process by a shunned member. When this happens there will be a thread in this state: "P2P message reader@107a05c": at com.gemstone.org.jgroups.protocols.pbcast.GMS.determineCoordinator(GMS.java:1127) - waiting to lock <0xe4a511c8> (a com.gemstone.org.jgroups.Membership) at com.gemstone.gemfire.distributed.internal.membership.jgroup.JGroupMembershipManager.requestMemberRemoval(JGroupMembershipManager.java:2552) at com.gemstone.gemfire.distributed.internal.membership.jgroup.JGroupMembershipManager.addSurpriseMember(JGroupMembershipManager.java:1779) - locked <0xe46f2330> (a com.gemstone.gemfire.distributed.internal.membership.jgroup.JGroupMembershipManager$ViewLock) at com.gemstone.gemfire.internal.tcp.Connection.setRemoteAddr(Connection.java:1126) at com.gemstone.gemfire.internal.tcp.Connection.processNIOBuffer(Connection.java:3690) at com.gemstone.gemfire.internal.tcp.Connection.runNioReader(Connection.java:1739) at com.gemstone.gemfire.internal.tcp.Connection.run(Connection.java:1620) at java.lang.Thread.run(Thread.java:662) | |
| #44989 | 05/01/12 | Repeatedly opening and closing a cache from XML containing a cache server caused a memory leak | In prior versions, repeatedly opening and closing a cache from XML containing a cache server caused a memory leak of cache instances. | |
| #44991 | 05/02/12 | Product shuts down with ForcedDisconnectException when other members are unable to form TCP/IP connections quickly enough | It is possible for the product to shut down a member if that member is not accepting TCP/IP connections quickly enough. Prior to the 6.6.2 release, the JDK's timeout mechanism on TCP/IP connection formation was not functioning as documented. The 6.6.2 release of GemFire added an alternative mechanism for timing out these connection attempts. That mechanism, set at 3 times the member-timeout interval, has been found to be too aggressive. The fix increases the timeout to 6 times the member-timeout interval. This problem can be worked around by increasing the member-timeout interval of the distributed system. | |
| #44995 | 05/03/12 | hang during shutdown with concurrent surprise member processing | It is possible for the product to deadlock during shutdown if a shunned member is concurrently trying to connect to it. One thread will have this stack trace: "P2P message reader@8739bf" daemon prio=10 tid=0x082a5000 nid=0x75b6 in Object.wait() [0xdecfb000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) at java.lang.Object.wait(Object.java:485) at com.gemstone.gemfire.distributed.internal.InternalDistributedSystem.waitDisconnected(InternalDistributedSystem.java:1172) - locked <0xe49f71e0> (a java.lang.Object) at com.gemstone.gemfire.distributed.internal.InternalDistributedSystem.disconnect(InternalDistributedSystem.java:1239) - locked <0xe4b6b720> (a com.gemstone.gemfire.distributed.internal.InternalDistributedSystem) - locked <0xe0fd6748> (a java.lang.Class for com.gemstone.gemfire.internal.cache.GemFireCacheImpl) at com.gemstone.gemfire.distributed.internal.InternalDistributedSystem.disconnect(InternalDistributedSystem.java:899) at com.gemstone.gemfire.distributed.internal.DistributionManager$MyListener.membershipFailure(DistributionManager.java:4391) at com.gemstone.gemfire.distributed.internal.membership.jgroup.JGroupMembershipManager.requestMemberRemoval(JGroupMembershipManager.java:2595) at com.gemstone.gemfire.distributed.internal.membership.jgroup.JGroupMembershipManager.addSurpriseMember(JGroupMembershipManager.java:1779) - locked <0xe4b3d578> (a com.gemstone.gemfire.distributed.internal.membership.jgroup.JGroupMembershipManager$ViewLock) at com.gemstone.gemfire.internal.tcp.Connection.setRemoteAddr(Connection.java:1126) at com.gemstone.gemfire.internal.tcp.Connection.processNIOBuffer(Connection.java:3690) at com.gemstone.gemfire.internal.tcp.Connection.runNioReader(Connection.java:1739) at com.gemstone.gemfire.internal.tcp.Connection.run(Connection.java:1620) at java.lang.Thread.run(Thread.java:662) | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| #45001 | 05/04/12 | Race condition may cause NPE when accessing CQ stats | Although the window for this exception is quite small, there is a chance that an NPE will be thrown on either the server side or the client side when accessing a CQ's stats while the CQ is being registered. | None | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| #45040 | 05/18/12 | Hang after running out of disk space | In rare cases, when a member with persistent or overflow regions runs out of disk space or encounters other disk IO errors, the member can hang trying to close the regions rather than close the affected regions gracefully. | Try to avoid running out of disk space. If this situation is encountered, kill the member which threw a DiskAccessException. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| #45051 | 05/21/12 | Possible missing results when executing a limit query with AND clauses with specific indexes used | When an index is used for a limit query with an AND clause, it is possible that the number of results returned is lower than the limit, even though there are enough entries to fulfill the limit. An example query that could cause this issue, assuming there is an index on POS.SECID: SELECT * FROM /PORTFOLIO_REGION P, P.POSITIONS.VALUES POS WHERE P.ID > 5 AND POS.SECID = 'VMW' LIMIT 5 | It may be possible to rewrite the query, such as SELECT * FROM (SELECT * FROM /PORTFOLIO_REGION P, P.POSITIONS.VALUES POS WHERE POS.SECID = 'VMW') S WHERE S.ID > 5 LIMIT 5 Or SELECT * FROM /PORTFOLIO_REGION P, P.POSITIONS.VALUES POS WHERE POS.SECID IN (SELECT POS.SECID FROM /PORTFOLIO_REGION P, P.POSITIONS.VALUES POS WHERE POS.SECID = 'VMW') AND P.ID > 5 LIMIT 5 If the index on POS.SECID could be removed. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| #45053 | 05/21/12 | Durable client setting fails if set to maximum integer value | The durability of client on the server is based on the durable-client-timeout value. By default, it is set to 300 seconds, and the maximum value for the property has been defined as the maximum integer value. However, due to a type conversion issue, the maximum value that can actually be set for the property is 2147483. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| #45077 | 05/24/12 | Spurious DistributedSystemDisconnectedException with cause IllegalStateException | A cache operation may, in rare circumstances, throw a DistributedSystemDisconnectedException when the system is not disconnected. This may occur when a peer process has disconnected or crashed. The cause of the exception will be an IllegalStateException in this form: Caused by: java.lang.IllegalStateException: Task already scheduled or cancelled at java.util.Timer.sched(Timer.java:401) at java.util.Timer.scheduleAtFixedRate(Timer.java:328) at com.gemstone.gemfire.internal.SystemTimer.scheduleAtFixedRate(SystemTimer.java:386) at com.gemstone.gemfire.internal.tcp.ConnectionTable.scheduleIdleTimeout(ConnectionTable.java:555) ... 31 more | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| #45104 | 06/01/12 | shutdownAll hangs with mix of 6.6.2 and 6.6.3 members | In 6.6.3 and later, a bug fix changed the order in which partitioned regions are closed, so the closing order in 6.6.3+ differs from that in 6.6.2. As a result, the ShutdownAll operation hangs in a mixed cluster of 6.6.2 and 6.6.3 members. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| #45117 | 06/05/12 | Creating a subregion requires the use of deprecated apis | Currently no API exists that lets you create a subregion using ClientRegionFactory or RegionFactory. The createSubregion methods on Region all take the old RegionAttributes which is created using the deprecated AttributesFactory. | Use the deprecated AttributesFactory | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| #45131 | 06/07/12 | Nested In Queries may cause undefined values to be returned | Using a nested IN query may cause the results to all be undefined. A query such as SELECT P2.ID FROM /portfolios2 P2 where P2.ID in (SELECT P.ID from /portfolios1 P where P.someValue >= 500L and P.someValue < 1000L), with indexes on both P.ID and P2.ID along with an index on P.someValue, will cause incorrect results to be returned. | To work around this issue, break up the query into multiple queries and use bind parameters, or possibly remove an index. Rewriting the query might also work under the right circumstances. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| #45132 | 06/07/12 | Nested In Query with And condition within a range may cause ClassCastException | Somewhat related to bug #45131. Create 2 regions, /portfolios1 P and /portfolios2 P2, and create indexes on P.ID, P2.ID and P.someValue. Executing the following queries in the specified order will result in a ClassCastException. First execute "SELECT P.ID FROM /portfolios1 P WHERE P.someValue >= 500L AND P.someValue < 1000L" and pass the results into the bind parameter of "SELECT P FROM /portfolios1 P WHERE P.ID IN(SELECT P2.ID FROM /portfolios2 P2 where P2.ID in ($1)) and P.someValue >=500L and P.someValue < 1000L" | Break up the query into 3 parts instead of 2 and pass the results of each query as bind parameters to the next query. The queries would be: "SELECT P.ID FROM /portfolios1 P where P.someValue >= 500L AND P.someValue <1000L"; "SELECT P2.ID FROM /portfolios2 P2 where P2.ID in ($1)"; "SELECT P.ID FROM /portfolios1 P where P.ID in ($1) AND P.someValue >=500L and P.someValue < 1000L"; | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| #45142 | 06/08/12 | SystemConnectExceptions thrown by new servers after shutting down locator | It is possible for the membership to become confused and not elect a new membership coordinator when the old one is shut down. This results in new server-side processes throwing a SystemConnectException when attempting to start up. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| #45153 | 06/11/12 | Hang in backup with concurrent operations | In rare cases, performing a backup on a system with many concurrent operations can cause a hang. | Perform the backup during periods of inactivity. If a backup is performed on a system with heavy activity and the system hangs, killing the command line process that initiated the backup will fix the hang. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| #45164 | 06/12/12 | Iteration over Region.entrySet() on | Iteration over Region.entrySet() on a partitioned region may produce a null entry if that entry was destroyed concurrently. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| #45309 | 06/29/12 | Recovery of a PARTITION_PERSISTENT_OVERFLOW region can be slow | If a member is hosting a persistent partitioned region with a large amount of data that has overflowed to disk, recovery may be slow. This delay is most noticeable when the amount of overflow data exceeds the OS page cache. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| #45343 | 07/05/12 | Updates not distributed to WAN sites | When the WAN concurrency level is greater than one, some updates from putAll operations may not be distributed to remote sites. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| #45397 | 07/11/12 | Region entry expiration time is persisted now. | Before this fix, a recovered entry recalculated its expiration time from the time of recovery. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| #45667 | 08/07/12 | afterRegionDestroyed event in CacheListener | After a GemFire member was forcefully disconnected from the distributed system, the afterRegionDestroyed event in the listener had the Operation type CACHE_CLOSE, but it should have been FORCE_DISCONNECT. Fixed in 7.0. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| #45886 | 08/22/12 | Javadocs for RegionShortcut.REPLICATE_PERSISTENT are incorrect | The javadocs for RegionShortcut.REPLICATE_PERSISTENT are incorrect. They say it configures heap LRU and overflow to disk. The correct javadocs follow: * A REPLICATE_PERSISTENT has local state that is kept in sync with all other replicate * regions that exist in its peers. * In addition its state is written to disk and recovered from disk when the region * is created. * The actual RegionAttributes for a REPLICATE_PERSISTENT region set the {@link DataPolicy} to {@link DataPolicy#PERSISTENT_REPLICATE} and {@link Scope} to {@link Scope#DISTRIBUTED_ACK}. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| #45927 | 08/23/12 | GemFire installs incorrectly | When the product is to be installed in a directory with a name which contains non-ASCII characters, the name input by the user may become garbled and the incorrect directory will be created. This is known to occur on Windows systems when running the installer from cmd.exe. | On Windows systems, ensure that the codepage being used is 1252. This can be checked using the 'chcp' command. To set the codepage, execute 'chcp 1252'. This will only change the codepage for that session and will not affect any system-wide settings. Alternatively, start the installer with the java system property gemfire.installer.directory set to the name of the directory to be created. For example: java -Dgemfire.installer.directory=Grüße installer.jar | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| #45931 | 08/23/12 | If customer did not specify classpath for their instantiators, offline compaction (including the conversion) will lose the instantiators. | This bug is inherited from 6.5. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| #45943 | 08/24/12 | javadocs on ClientCacheFactory and PoolFactory incorrect | The javadocs that describe when an IllegalArgumentException is thrown are incorrect on the following methods. The corrections are listed in this note: ClientCacheFactory#setPoolIdleTimeout(long) PoolFactory#setIdleTimeout(long) correction: IllegalArgumentException - if idleTimeout is less than -1. ClientCacheFactory#setPoolPingInterval(long) PoolFactory#setPingInterval(long) correction: IllegalArgumentException - if pingInterval is less than or equal to 0. ClientCacheFactory#setPoolReadTimeout(int) PoolFactory#setReadTimeout(int) correction: IllegalArgumentException - if timeout is less than 0. ClientCacheFactory#setPoolRetryAttempts(int) correction: IllegalArgumentException - if idleTimeout is less than -1. | Ignore the incorrect javadocs and read the corrections in the note. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| #45983 | 08/28/12 | GemFire fails to start | Under Windows, GemFire may fail to start if any file-related properties, such as cache-xml-file or log-file, contain non-ASCII characters in the path. | Ensure that all paths referenced in gemfire.properties contain only ASCII characters. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| #46004 | 08/30/12 | NullPointerException while querying partitioned regions with numThreads system property. | When a query was executed on a partitioned region with the "gemfire.PRQueryProcessor.numThreads" system property set, the query threw an NPE. The NPE was thrown in the log messages. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| #46172 | 09/13/12 | Deadlock while registering instantiators | Disabling the deserialization class cache with -Dgemfire.loadClassOnEveryDeserialization=true will prevent this deadlock but will also reduce your deserialization performance. Instead of registering your instantiators in static initializers that get run when the class is loaded, you can register them with API calls early in your application's life, or you can register them in your cache.xml file using an instantiator element. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| #46456 | 10/02/12 | processes slow to shut down after network failure | A process that becomes isolated with one or more other processes due to a network failure may be slow to shut down or not shut down at all. This has been observed when one of the isolated processes was a locator. The locator shut down normally but other processes either did not shut down at all or did not do so until the network was fixed. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||