infosystems
Your research computing experience will involve using and developing information systems. We will take a quick look at the various components, types, and development models of these systems.
The primary components of an information system[1] are hardware, software, data, and people — the most important component of all! Why? Because systems are designed and built by people for people. If people don’t use them, or they do not serve the people’s needs, then they are worthless! Today we will take a closer look at how information systems are designed to help us.
The physical machinery of a computer system is called its hardware. This means the computer itself, its chassis, and the parts inside it, including its core integrated circuit, known as the central processing unit (CPU), its memory, called random-access memory (RAM), and any internal storage devices such as hard disk drives (HDD) and solid-state drives (SSD).
Accessories or peripherals are the devices you plug into the computer, mostly for input and output.
Networking equipment includes all of the devices that allow your computer to communicate with other systems. Examples are the network cables and the boxes they connect to, such as routers, switches, hubs, wireless access points, and modems.
Software is the name for the instructions we give to computing devices to tell them what to do. Software is "soft" because the instructions are not physical entities like hardware devices. The instructions may be stored on physical media like a hard disk or USB thumb drive, just as a cooking recipe may be written on a piece of paper or printed in a book. However, the recipe itself is just a conceptual model of how to perform a task. Likewise, a software program is essentially just a list of instructions (or a logical model that issues instructions) for the execution of a set of desired computing operations.
As you use a computer, the software instructions that are executed on your behalf by the CPU, such as programs and apps, are called application software. Applications are the programs that serve a specific purpose for a computer user or are to be used for completing certain tasks, such as exploring the Internet, editing a text document, or working with data.
Applications run within an overall software environment called the operating system (OS).
Notable examples are the familiar Microsoft Windows, OS X, iOS, Android and Linux operating systems.
An operating system also has a kernel, which is the central software program that manages the data exchange between the CPU and the other components within a computer. The kernel communicates with those components using device drivers, which are small programs that provide a software interface to the hardware. Devices that contain integrated circuits of their own may store software in firmware that allows updates through a procedure called flashing. The computing system will also contain utility software such as configuration and management tools, plus shared software libraries used by both applications and system software.
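As a small illustration of these layers, an application can ask the operating system to describe itself. The following is a minimal sketch using only Python's standard `platform` module. The function name `describe_system` is our own, and the values reported will differ from one machine to another.

```python
import platform


def describe_system():
    """Return a few facts about the operating system this program runs on."""
    return {
        # Name of the operating system, e.g. "Linux", "Darwin", or "Windows".
        "os_name": platform.system(),
        # Release string reported by the OS (on Linux, the kernel release).
        "os_release": platform.release(),
        # Hardware architecture, e.g. "x86_64" or "arm64".
        "machine": platform.machine(),
    }


if __name__ == "__main__":
    for key, value in describe_system().items():
        print(f"{key}: {value}")
```

The application never touches the hardware directly here. It asks the operating system, which in turn relies on the kernel and its drivers.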
Data refers to all of the information in the system. It may be stored as raw (unprocessed) values, or may be in the form of summary tables, plots, written documents, photographs, music, videos, or just about any other form of information which can be digitized. Data can be at rest, in which case it will probably be saved in some sort of file or may just be occupying some bits of system memory. Data may also be in motion, flowing between the components within a single computer system or between nodes of a network. As the boundaries of an information system will usually extend beyond computer systems, data may also reside on scraps of paper or may only exist in a person’s mind in the form of a thought or idea, waiting to be communicated to the rest of the information system.
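The difference between data at rest and data in motion can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the sample `record` and the function names are invented for this example, and a pair of connected sockets within one process stands in for a real network connection.

```python
import json
import os
import socket
import tempfile

# Hypothetical sample record, invented for this illustration.
record = {"site": "A", "temperature_c": 21.5}


def save_at_rest(data, path):
    """Data at rest: write the data to a file, where it stays until read."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f)


def load_at_rest(path):
    """Read the stored data back from the file."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def send_in_motion(data):
    """Data in motion: pass the data between two connected sockets."""
    sender, receiver = socket.socketpair()
    try:
        sender.sendall(json.dumps(data).encode("utf-8"))
        sender.close()  # closing the sender signals the end of the data
        chunks = []
        while True:
            chunk = receiver.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
        return json.loads(b"".join(chunks).decode("utf-8"))
    finally:
        receiver.close()
        sender.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "record.json")
        save_at_rest(record, path)
        print("at rest:", load_at_rest(path))
    print("in motion:", send_in_motion(record))
```

In both cases the same information is preserved. What changes is whether it is sitting in storage or flowing between components.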
People are an integral part of the system. We design it, build it, use it, maintain it, and adapt it to new uses. Our systems should serve us, not the other way around. Every aspect of the system should be designed to serve people’s needs optimally. But our needs vary, and so we need various types of systems.
We may use several types of information systems each day. Let’s take a quick look at a few of the most common types.[2]
Most of us are very familiar with search information systems like web search engines, such as Google Search, but many sites use domain-specific search engines like PubMed.
Spatial information systems in the form of geographic information systems (GIS) have become increasingly important in recent years. ArcGIS has dominated this field, with the free and open QGIS gaining in popularity.
Global information systems (GLIS) are those either developed or used in a global context. Public health examples include global health databases such as the UNHCR Statistics & Operational Data Portals and the WHO’s Global Health Observatory (GHO).
Enterprise systems are comprehensive organization-wide applications used for Enterprise Resource Planning (ERP).
Expert systems support such specialty domains as diagnosis, forecasting, and delivery scheduling. They use artificial intelligence to apply knowledge and reasoning in order to solve complex problems.[3]
Office automation systems refer to systems which support the everyday business operations of an organization. Business Process Automation (BPA) uses these systems to improve efficiency by streamlining routine activities.
Personal information systems help people manage their individual communications, calendaring, note-taking, diet, and fitness.
Developers undertake the software development process using several different approaches. Let’s take a look at a few of the most popular models.
Here is a short list of common development models.[4] We have provided links from each of these to relevant Wikipedia pages. You are encouraged to read more about them. We’ll just go through the list quickly to give you a rough idea of the differences between them.
The Systems development life cycle (SDLC) is the classic model. It involves lots of up-front planning and is risk averse.
Waterfall development is another "old school" favorite. It is like the SDLC but does not offer any sort of feedback loop.
Prototyping is a useful technique for many models. It works well when a small-scale experiment can prove an idea without risking heavy investment.
Iterative and incremental development might evoke the image of "baby steps" or the notion of "try, try, again". There is a central loop, between initial planning and final deployment, which repeats as needed. Like prototyping, it is a technique which can be used in other models.
Likewise, Spiral development is meant to address evolving requirements through cycles of repeated analysis and design, getting closer and closer to the desired product. The idea is that the entire process is repeated over and over until you are finally satisfied.
Rapid application development (RAD) focuses on development more than up-front planning.
Agile development is a more evolved form of RAD, with more of a focus on user engagement. It is gaining wide popularity.
Code and fix sounds like what it is — cowboy coding — what most lone programmers do, and what might seem most familiar to you as a scientific researcher. This can be quick for easy projects, but can be very inefficient and expensive for larger projects, due to insufficient planning.
They are all useful methods, though, some more generally than others. The approach you take should depend upon your situation.
We’ll look more closely at three of these right now.
Since information systems are so complex, it is very helpful to follow a standard development model to make sure you take care of all of the little details without missing any.
For years, the standard development model was known as the SDLC, or Systems Development Life Cycle.[5] It works well for large, complex, expensive projects, but can be scaled down as needed. Many of its phases are used in the other models as well. Let’s take a quick look at them.
Systems development life cycle (SDLC) phases:
- There is a focus on careful planning before any design or coding takes place. The feasibility study explores your options and helps you gain approval from stakeholders.
- Analysis includes a detailed study of the current system and clearly identifying requirements before designing a new system.
- Once you have thoroughly defined the requirements, you can begin to model the new system.
- Implementation is where the hardware assembly, software coding, testing, and deployment take place.
- Maintenance may sound boring, but it is essential to ensure that the project is an overall success.
The main idea is that systems development is a cycle — a continual process. You need to allow for maintenance, updates, and new features. The use and upkeep of the system provides feedback which goes into planning the next version.
We will spend more time on the SDLC and its early phases in a separate module.
A related model is the Waterfall model.[6] It has basically the same steps as the SDLC, but visualizes them as cascading stair-steps instead of a circle.
It is similar to the SDLC, but without the feedback loop. One phase leads to the next, and the output of one phase becomes the input of another. The model came from manufacturing, where after-the-fact changes are expensive or impossible.
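The idea that each phase hands its output to the next, with no way back, can be sketched as a simple one-way pipeline. This is a toy illustration, not a project management tool: the phase functions reuse the SDLC phase names, and the artifact names they produce are invented for this example.

```python
# Each phase takes the artifacts produced so far and adds its own.
# The artifact names are invented for illustration.
def planning(artifacts):
    return artifacts + ["project plan"]


def analysis(artifacts):
    return artifacts + ["requirements"]


def design(artifacts):
    return artifacts + ["system model"]


def implementation(artifacts):
    return artifacts + ["working system"]


def maintenance(artifacts):
    return artifacts + ["updates"]


PHASES = [planning, analysis, design, implementation, maintenance]


def run_waterfall(phases):
    """Run each phase exactly once, in order, feeding its output to the next."""
    artifacts = []
    for phase in phases:
        artifacts = phase(artifacts)
    return artifacts


if __name__ == "__main__":
    print(run_waterfall(PHASES))
```

Note that nothing in `run_waterfall` ever returns to an earlier phase. Adding a feedback loop around this pipeline is what would turn it back into a cycle.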
The Agile model is newer but very popular, especially among smaller teams within budding organizations. Hallmarks of this model include methods such as pair programming, test-driven development, and frequent product releases.[7]
Agile favors smaller teams that can meet regularly, ideally face-to-face. In pair programming, people work in pairs, with one person coding and the other helping "over the shoulder". In test-driven development, after you identify use cases, you write tests and then build the system to pass them. By developing an automated test and build system, you can push releases out quickly and more often.
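Test-driven development is easier to picture with a tiny example. The sketch below uses Python's standard `unittest` module. The use case (converting Celsius to Fahrenheit) and all of the names are our own, chosen only for illustration. The tests appear first because, in this approach, they are written first.

```python
import unittest


# Step 1: write the tests first. They describe what the use case requires.
class TestCelsiusToFahrenheit(unittest.TestCase):
    def test_freezing_point(self):
        self.assertEqual(celsius_to_fahrenheit(0), 32)

    def test_boiling_point(self):
        self.assertEqual(celsius_to_fahrenheit(100), 212)


# Step 2: write just enough code to make the tests pass.
def celsius_to_fahrenheit(celsius):
    return celsius * 9 / 5 + 32


def run_tests():
    """Run the test case above and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(
        TestCelsiusToFahrenheit
    )
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_tests()
```

Once tests like these exist, they can be run automatically before every release, which is what makes frequent releases practical.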
Information systems vary in the openness of their implementations, in terms of both interoperability standards and specific design details.[8]
You can have open systems (as well as open standards and open source), where the technical specifications are publicly available.[9] Different organizations may implement them in their own way, yet still maintain interoperability with other implementations.
Or systems may be closed, or proprietary, where an organization keeps the details to itself, making it more difficult for competitors to interoperate. While this may provide a competitive advantage for the producer, it contributes to what is called vendor lock-in, where a consumer becomes dependent on the vendor and is unable to switch to another due to the high costs and disruption.
These interoperability aspects will include file formats, communications protocols, security and encryption.
All of those are important when you are collaborating, sharing data and files with others, who might be using different platforms.
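To make the file format point concrete, here is a minimal sketch that saves the same small dataset in two open, plain-text formats (CSV and JSON) using only the Python standard library. The sample `rows` and the function names are invented for this illustration. Because both formats are publicly specified, a collaborator on another platform could read the results with R, a spreadsheet, or almost any other tool.

```python
import csv
import io
import json

# Hypothetical measurements, invented for this illustration.
rows = [
    {"sample_id": "S1", "value": "4.2"},
    {"sample_id": "S2", "value": "3.9"},
]


def to_csv(records):
    """Serialize the records as CSV text with a header row."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=["sample_id", "value"])
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


def from_csv(text):
    """Parse CSV text back into a list of dictionaries."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text, newline=""))]


def to_json(records):
    """Serialize the records as JSON text."""
    return json.dumps(records, indent=2)


def from_json(text):
    """Parse JSON text back into a list of dictionaries."""
    return json.loads(text)


if __name__ == "__main__":
    print(to_csv(rows))
    print(to_json(rows))
```

Reading either format back yields the original records, which is the kind of round trip that a closed format may not guarantee outside the vendor's own software.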
By using transparent systems, you not only increase your ease of communication and collaboration, you also contribute to openness in a broader, social context.
Information transparency supports openness in:
We have provided links to several popular movements which are working to increase openness and transparency in various aspects of society. You are encouraged to spend some time learning about these trends.
So, if you want the benefits of openness in your work and more freedom to make changes, consider building your information infrastructure with open technologies.
As an example, we have assembled a transparent information system to create and support this course.
We have developed the course transparently, using an open content review process where students, staff and faculty look at the materials and evaluate them to determine whether or not they best meet the course goals.
We have an open content license, the Creative Commons Attribution Share-Alike CC BY-SA 4.0 International license.
We have open development, where our source is freely and publicly available on GitHub.
We are using open file formats (Markdown, HTML, CSS, PNG, AsciiDoc, PDF), open source tools (RStudio, Git, Redmine, Canvas, Linux, Bash) and open communications protocol standards (HTTP/HTTPS).
As you take part in this course, and provide feedback which will go toward improving it, we thank you for contributing!
We hope that this brief overview of information systems has given you a clearer picture of what they are and how they are built.
For more information, please read the related sections in the Computing Basics Wiki, particularly, the pages on hardware and software.
In the next module, we will take a closer look at requirements gathering and systems analysis, two of the most important topics of this course.
The latest version of this document is online at: https://github.com/brianhigh/research-computing/wiki Copyright © The Research Computing Team. This information is provided for educational purposes only. See LICENSE for more information. Creative Commons Attribution 4.0 International Public License.