Escience:Anthropology Cyber-Infrastructure Workshop

Proposal for a Cyber-infrastructure development workshop for Anthropology and related Social Sciences

Proposal


Requirements

Consider anthropology's needs in cyberinfrastructure and address the following questions:

  1. What do we need in order to make it possible for anthropologists to contribute to cyber-based, systematic data collection, data analysis, and data sharing?
  2. What is available now, and what would it cost to support the use of this technology to advance the mission of anthropology?
  3. How do we make sure that good technology is available to anthropologists in the field (satellite-linked downloads and uploads, for example)?
  4. How do we support comparative studies at home (databases of languages and cultures that recognize each other and build on one another)?

Anthropology requirements

  1. Pervasive computing in the field, the lab and the classroom
  2. Asynchronous and low-speed (glacial) E-Science
  3. Resource sharing/pooling
  4. Sharing knowledge about methods and techniques
  5. Sharing tools
  6. Flexible parallel multiple-ontology meta-data

Criteria and mission

The proposal should set out the proposed purpose of the Workshop, and the outcomes and outputs that the proposers hope to achieve, including capacity building and any longer-term collaboration that might be foreseen. It should provide a list of the proposed Workshop participants with short curricula vitae. It should indicate the proposed organization, including venue, timing (up to 31 March 2007), programme and related matters, and a clear indication of the costs which are sought, including travel, accommodation, catering, and administrative arrangements.

Establish collaboration on global or large-scale problems that require efforts of international research and education projects in the area of cyber-infrastructure, including the development of the next generation of cybertools applied to data collections in the social and behavioral sciences and underpinning research issues.

Proposals will be evaluated by NSF and ESRC on the basis of their usual criteria (see guidance notes attached), along with the additional criteria of: international excellence, scope and importance of the problem; mutual scientific benefits and impact; organisation and planning; contribution to capacity development through international experience for students and researchers early in their career.

Background

The focal relevance of E-Science to Anthropology lies not in the speed, storage etc. associated with grid infrastructure (though these are very important features), but in pervasive computing, and pervasive in the fullest sense at that. Having (with suitable authorisation) access to data and applications distributed over a large number of physical systems, available as if any portal ‘was’ the virtual system created by the grid, is a very powerful resource. Not only will E-Science provide the speed and large-scale storage required for many more ambitious projects in social research, it will provide that power anywhere and at any time, using data that can be distributed across the community of researchers. Publications on E-Science media might include their underlying data, the applications for processing this data and the contextual data, in a form that can be used directly by the audience of the publication. In combination with other technologies, such as wireless networking, the information environment in which research is carried out will be pervasive in the office, the home and the field. There is a lot of work to be done before this vision can be realised, but it does look attainable if a number of social issues can be resolved. The most important bridges to cross are how to formulate ethical practice for data sharing, and how to reshape the increasing restrictions arising from the current inflexibility in the use of intellectual property, in particular so-called Digital Rights Management (DRM), which legislation in both North America and Europe is allowing to be dominated by the entertainment industry rather than by social benefit.

One could argue that many of the technological aspects of E-Science development are outside the remit of social scientists. To an extent that is true, especially for issues such as the funding of the infrastructure and the development of many key technologies. However, the history of social science computing indicates that many computational methods have arisen directly from the social science community which would probably never have arisen elsewhere, simply because they address problems that are either unique to social science before the fact (often these social science technologies are found to be useful to others after the fact) or involve a level of control that others do not find interesting.

A case in point is the development of qualitative research tools in general. To my knowledge all of the current ‘useful’ tools were developed by social scientists or humanities scholars (with the possible exception of some ‘free text’ database tools), despite a massive amount of development of text-based software outside of these domains. Such software is usually shaped to deal with areas of interest outside social science and the humanities. So we have countless editors for program code, indexers for program code, comparison programs for program code and documentation for the same, which may be usable by others but are certainly not optimised for the task.

The social science community and its funders must be involved in some software development, at least of reference applications. Many of the reasons that underlie the general policy of NOT supporting software development, while once well founded, are no longer valid. One of the points of writing for an E-Science context is that, if certain principles are followed, the issues of compatibility no longer apply: even code that is written for a specific specimen platform is not a barrier so long as the E-Science grid concerned includes that platform among its resources. So guidelines would need to be followed, but the reuse value of software in the E-Science world is very different from that in the single-user world.

The harnessing of ‘open source’ software should be a priority for funders. Not only does this resolve many issues of intellectual property restrictions and remove the vagaries of depending on commercial development (and on the continued success of companies, lest they fail and remove their product from the market), it also increases the support resources available. There are many talented programmers around the world who will contribute to such projects if enough funding is made available to set up the initial management structure required for a successful open source project.

Then there are projects such as ours, funded by the NSF, EPSRC and the ESRC, to develop cyber-capabilities and grid middleware. Using a linguistic analogy, the grid concept currently simply creates a ‘universal morphology’ for computation. It does not specify the syntax or semantics. If we assume that social scientists are concerned with representing and managing information in ways that differ from other domains, then it is up to the social science community to define and implement the means for doing so. Our project is a beginning for this process, but certainly will not be the end. This is a challenge in which we as social scientists will be joined by others, since everyone will have some of the same management issues (current concepts such as ‘files’ are not very conducive to E-Science, since in practice the contents of a file should consist of ‘addressable’ data). Some of the solutions are bound to originate in the social sciences, whose practitioners are often in a better position to understand how resources interact with each other in a social environment.

Even in the case of ‘bleeding edge’, or even ‘standing on the edge’, applications such as quantum computing, social science funders should encourage collaboration at least at some stages of development. Quantum computing will revolutionise our ability to do social modelling and analysis once it is sufficiently developed. It is important that we are in the queue to influence how this technology develops as it moves towards implementation.